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ABSTRACT 


This  technical  report  summarizes  the  image 
understanding,  smart  sensor,  and  image  processing  research 
activities  performed  by  the  Image  Processing  Institute  at 
the  University  of  Southern  California  during  the  period  of  1 
April  1977  through  30  September  1977  under  Contract  Number 
F-3361 5-76 -C -1203  with  the  Advanced  Research  Projects  Agency 
Information  Processing  Techniques  Office. 

The  research  program  has  as  its  primary  purpose,  the 
development  of  techniques  and  systems  for  understanding 
images.  Five  tasks  are  reported:  Image  Understanding 
Projects,  Image  Processing  Projects,  Smart  Sensor  Projects, 
Recent  Ph.D.  Dissertations,  and  Recent  Institute  Personnel 
Publications.  The  image  understanding  tasks  reported  on 
include  comparison  of  region  growing  versus  boundary 
delineated  segmentation,  analytic  results  on  the  clustering 
segmentor,  development  of  feature  extractors  for  edge 
detection,  circle  detection,  line  detection,  and  texture 
detection  and  higher  level  image  understanding  for  an 
interactive  user  system  as  well  as  a system  for  sharing 
information  between  existing  image  understanding  programs. 
The  image  processing  projects  include  degrees  of  freedom 
analyses  for  radar  images,  blind  a posteriori  phase 
restoration,  psychovisual  modelling  for  image  rate 
distortion  analysis,  and  optical  processing  procedures  for 
on-axis  holographic  filtering  and  optical  pseudocoloring  of 
texture  information.  The  smart  sensor  project  describes 
testing  of  the  Sobel  circuit  and  fabrication  of  the 
spatially  variant  circuit.  Recent  Ph.D.  dissertations  are 
discussed  in  the  following  section,  the  report  concluding 
with  listings  of  recent  publications. 
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1 . Research  Overview 

This  document  represents  the  fourth  semiannual  report 
funded  under  the  current  ARPA  Image  Understanding  contract. 
Research  over  the  past  six  months  (and,  in  fact,  past  two 
years)  has  been  devoted  to  three  major  areas.  The  first, 
and  majority  of  effort,  has  been  in  the  area  of  image 
understanding.  The  second  has  been  the  area  of  smart  sensor 
design,  and  the  third  has  been  the  area  of  research  in  image 
processing. 

Image  Understanding 


This  aspect  of  our  research  has  concentrated  upon  both 
bottom  up  and  heterarchical  methodologies  for  image 
understanding.  Segmentations  based  on  region  growing  and 
boundary  delineation  have  been  compared  to  test  the  strength 
and  weaknesses  of  each.  Analytic  results  of  a specific 
clustering  segmentor  are  developed.  Specific  feature 
extractors  for  edge  detection,  circle  detection,  line 
detection,  and  texture  detection  are  each  investigated  in 
considerable  detail.  The  emphasis  of  approach  is  somewhat 
evenly  distributed  between  the  use  of  mathematical  tools  and 
the  use  of  computer  science  tools.  At  the  higher  levels  of 
the  system  a more  sophisticated  viewpoint  is  developed  for 
the  heterarchical  methods  applied  to  locating  structures  in 
aerial  images.  In  addition  these  high  level  techniques  are 
being  applied  to  the  development  of  an  interactive  user 
system  for  usage  by  Institute  personnel  and  visitors. 
Finally,  the  development  of  an  initial  system  for  sharing 
information  between  existing  image  understanding  programs  is 
underway.  It  is  in  these  latter  directions  that  we  expect 
the  heterarchical  approach  to  provide  fruitful  results. 
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Smart  Sensors 

This  section  represents  the  smart  sensor  phase  of 
research  funded  during  the  past  six  months.  Two  circuits 
have  been  fabricated  for  CCD  analog  near  focal  plane 
processing  to  implement  a variety  of  front  end  image 
processing  functions.  The  first  circuit  implements  the 
Sobel  operator  on  an  image.  This  represents  a nonlinear 
spatially  invariant  processor.  The  second  circuit  is 
designed  to  implement  both  nonlinear  spatially  invariant  as 
well  as  variant  processes.  Both  circuits  have  been 
fabricated.  The  Sobel  circuit  has  been  tested  with  success, 
(see  accompanying  section)  and  the  second  circuit  i.c  soon  to 
experience  testing. 

Image  Processing 

This  section  of  the  report  describes  the  efforts 
expended  and  results  obtained  over  the  past  six  months  on 
the  various  image  processing  projects  carried  out  at  the 
Institute.  Some  of  these  projects  have  been  funded  by  other 
sources  as  indicated  in  the  appropriate  title 
acknowledgements.  Results  of  work  in  defining  the  degrees 
of  freedom  inherent  in  radar  imaging  systems  are  presented 
for  the  stripping  mode  of  SAR.  It  is  expected  that  future 
radar  imaging  model  degrees  of  freedom  analysis  will  lead  to 
efficient  radar  image  understanding,  a task  which  for 
humans,  is  far  more  difficult  than  visual  image 
understanding.  This  section  also  summarizes  the 
psychovisual  modelling  work  being  applied  to  image  coding 
and  image  rate  distortion  functions.  Results  in  blind  a 
posteriori  image  restoration  are  next  presented.  Finally 
two  new  optical  processing  procedures  are  described  in  which 
on-axis  holographic  optical  filtering  is  developed,  and 
optical  pseudocoloring  of  texture  information  is  described. 


Finally  a report  of  recent  Institute  Ph.D. 
dissertations  is  included  as  well  as  the  listing  of  recent 
Institute  personnel  publications  in  the  open  literature. 


2.  Image  Understanding  Projects 

This  aspect  of  our  research  has  concentrated  upon  both 
bottom  up  and  heterarchical  methodologies  for  image 
understanding.  Segmentations  based  on  region  growing  and 
boundary  delineation  have  been  compared  to  test  the  strength 
and  weaknesses  of  each.  Analytic  results  of  a specific 
clustering  segmentor  are  developed.  Specific  feature 
extractors  for  edge  detection,  circle  detection,  line 
detection,  and  texture  detection  are  each  investigated  in 
considerable  detail.  The  emphasis  of  approach  is  somewhat 
evenly  distributed  between  the  use  of  mathematical  tools  and 
the  use  of  computer  science  tools.  At  the  higher  levels  of 
the  system  a more  sophisticated  viewpoint  is  developed  for 
the  heterarchical  methods  applied  to  locating  structures  in 
aerial  images.  In  addition  these  high  level  technigues  are 
being  applied  to  the  development  of  an  interactive  user 
system  for  usage  by  Institute  personnel  and  visitors. 
Finally,  the  development  of  an  initial  system  for  sharing 
information  between  existing  image  understanding  programs  is 
underway.  It  is  in  these  latter  directions  that  we  expect 
the  heterarchical  approach  to  provide  fruitful  results. 


2.1  A Comparison  of  Some  Segmentation  Techniques 


Ramakant  Nevatia  and  Keith  Price 


Segmentation  is,  of  course,  a key  component  in  the 
Image  Understanding  process.  The  numerous  segmentation 
techniques  may  be  viewed  as  being  either  edge  based  or 
region  based.  The  edge  based  techniques  start  by  detection 
of  local  discontinuities  in  some  attribute,  such  as 
brightness  of  an  image  and  attempt  to  construct  object 
boundaries  from  them.  The  region  based  techniques  attempt 
to  find  areas  in  the  image  over  which  one  or  more  attributes 
are  constant. 

It  may  be  that  in  some  sense  the  two  techniques  are 
trying  to  compute  similar  functions  and  that  they  should  be 
capable  of  achieving  similar  performance.  However,  at  the 
present  state  of  development  of  these  methods,  one  or  the 
other  technique  may  be  more  successful  on  certain  kinds  of 
images.  This  is  a subject  of  active  discussion  among 
researchers  in  the  field,  but  we  are  unaware  of  any 
comparative  studies. 

In  the  section,  results  of  processing  two  selected, 
black  and  white  pictures  using  the  two  classes  of  techniques 
are  presented  that  lead  to  some  expected  conclusions  about 
their  suitability  for  different  tasks.  The  edge  based 
technique  is  that  developed  at  the  University  of  Southern 
California  [1-2],  and  the  region  based  technique  is  that  of 
Ohlander  [3],  modified  by  K.  Price  [4],  and  developed  at 
Car negie-Mellon  University. 

A brief  review  of  the  two  segmentation  techniques  used 
is  provided  here. 


a)  An  Edge  Based  Method  - In  this  method,  a local  edge 
operator  is  applied  to  an  image  first.  The  resulting  edges 
are  then  linked  in  straight  line  segments  and  only  segments 
of  a minimum  length  or  above  are  preserved  {for  details  see 
[1]).  It  is  hypothesized  that  such  segments  usually 
correspond  to  the  desired  boundaries. 

The  linking  method  is  independent  of  the  edge  operator 
used.  However,  the  final  performance  is  obviously 
determined  by  the  output  of  the  local  edge  operator.  We 
have  used  a Hueckel  edge  detector  (5]  in  previous 
experiments.  This  edge  detector  is  believed  to  have 
superior  performance  to  many  simpler  edge  detectors,  but  it 
is  not  always  effective  in  the  presence  of  texture.  A 
simple  edge  detector,  which  consists  of  convolving  an  image 
with  elongated  edge  masks  in  various  directions  and  choosing 
the  maximum  was  developed  and  found  to  perform  well  (for 
details,  see  [2]).  This  edge  detector  has  been  used  in 
results  presented  later. 

b)  A Region  Based  Method  - The  Ohlander  segmenter  operates 
by  computing  histograms  of  various  image  attributes  and 
segmenting  the  image  into  regions  with  a certain  range  of 
values  of  an  attribute.  The  attribute  with  the  best 
separation  (a  bimodel  distribution  in  the  histogram)  is 
chosen  for  segmentation.  Originally,  the  method  was 
developed  for  color  images.  We  have  used  only  black  and 
white  images  here  and. only  the  intensity  attribute  was  used. 

This  technique  is  recursively  applied  to  the  segmented 
regions  until  regions  become  too  small  or  cannot  be  further 
segmented  according  to  established  criteria  of  histogram 
separations.  Regions  smaller  than  a selected  size  are 
ignored.  Therefore,  long  thin  regions  which  are  broken  into 
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several  smaller  regions  may  be  lost. 


* 

Exper imental  Results 

The  two  test  images  are  shown  in  figures  la  and  2a. 
Figures  lb  and  2b  show  the  edges  detected  in  the  two  images. 
Figures  lc  and  2c  show  the  regions  detected  by  the 
Ohlander-Pr ice  segmenter.  (Results  of  linking  edges  in  line 
segments  are  not  shown). 

Following  are  some  observations  on  the  relative  merits 
of  the  two  approaches. 

1a)  The  performance  on  the  simpler  picture  of  the  truck  of 

figure  1 is  comparable.  The  edge  segmentation  can  be  more 
sensitive,  as  in  separating  the  front  and  top  surface  of  the 
truck,  but  the  boundary  is  fragmented  into  several  segments. 
Region  methods  always  give  closed  regions,  by  definition, 
which  may  be  easier  to  handle  for  some  types  of  objects  or 
processing.  Note  that  both  methods  fail  to  separate  the 
bush  and  truck  top  surface. 

b)  In  the  more  complex  aerial  picture,  the  edge  technique 
seems  to  extract  linear  features,  such  as  roads,  with  ease, 
whereas  the  region  method  does  the  same  for  parts  of  the 
image  that  are  homogeneous,  for  example,  the  lakes  in  figure 
2.  Note  that  the  major  vertical  road  is  not  extracted  as  a 
region  in  figure  2b,  but  is  only  indicated  by  the  boundaries 
of  the  regions  on  two  sides  of  it. 

c)  The  more  complex  parts  of  the  aerial  pictures  are  not 
adequately  analyzed  by  either  technique,  for  example,  the 
lower  part  of  the  river  of  the  suburban  areas  in  figure  2. 
The  main  difficulty  seesis  to  be  due  to  the  presence  of 
texture  and  fine  detail. 


(a)  Digit'  -ed  image 


Figure  1 . A Truck 
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(b)  Detected  edges  (c)  Segmented  regions 


Figure  2.  An  Aerial  Picture 
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Conclusions 


Interestingly,  the  two  methods  perform  similarly  on 
large  areas  of  the  tested  images.  However,  specific 
structures  are  handled  better  with  one  or  the  other.  The 
clear  implication  is  that  a complete  Image  Understanding 
system  should  utilize  both  depending  on  its  goals.  A 
straightforward  method  is  to  use  a specific  technique  to 
locate  particular  types  of  objects. 

The  two  segmentation  techniques  may  also  be  able  to 
reinforce  each  other  at  the  image  level,  for  example,  using 
regions  to  bridge  gaps  in  boundary  segments  or  to  use 
boundary  segments  to  sub-divide  regions.  We  have  not 
examined  such  interaction  in  depth. 
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2.2  Locating  Structures  in  Aerial  Images 
Ramakant  Nevatia  and  Keith  Price 


I ntroduction 

Analysis  of  aerial  images  is,  in  general,  a complex 
task.  The  reasons  for  such  complexities  are  many  arid 
varied.  A prime  cause  is  the  presence  of  texture  which 
causes  difficulties  for  the  low  level  processes  such  as  edge 
detection  and  segmentation.  Another  source  of  difficulty  is 
that  the  desired  objects  and  structures  may  be  small 
compared  to  the  size  of  a complete  image.  A detailed 
analysis  of  a complete  high  resolution  aerial  image  is 
generally  prohibitive  because  of  the  computational  costs. 

For  many  applications,  however,  a complete  and  general 
analysis  is  unnecessary.  Specific  structures  of  interest 
may  have  special  properties,  known  a priori,  that  allow  for 
their  easy  extraction.  The  problem  of  searching  for  small 
structures  is  helped  by  locating  them  by  their  spatial 
relationships  to  larger,  more  easily  located  structures. 

In  previous  work,  we  compared  two  segmentation 
techniques,  the  edge  based  and  the  region  based  methods,  and 
concluded  that  one  or  the  other  may  be  suited  for  extraction 
of  particular  types  of  structures  [1].  This  describes  our 
initial  attempts  to  use  both  techniques,  taking  advantage  of 
their  respective  strong  points. 
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The  problem  approached  is  that  of  finding  user 
specified  structures  in  aerial  images.  The  user  specifies 
the  properties  useful  for  the  location  of  the  desired 
structure  and  also  of  other  related  structures.  (An 
interactive,  question-answer  dialog  system  is  being 
developed  to  facilitate  interaction  with  a user,  see  (2].) 
This  amount  of  a priori  knowledge  is  likely  to  be  available 
in  many  applications  of  guidance  and  photo-interpretation. 

The  a priori  information  is  stored  as  properties  of 
objects  and  their  relationships  to  each  other,  and  may  be 
viewed  as  constituting  a graph  structure  with  the  objects  as 
nodes  and  relationships  as  arcs.  The  properties  and 
relationships  will,  in  general,  need  to  be  unrestricted. 
Currently,  an  object  is  described  either  by  a collection  of 
line  segments  or  by  its  region  properties.  The  segments  are 
described  by  their  length  and  width.  The  regions  are 
described  by  properties  such  as  brightness  (color)  and 
simple  shape  measures  (area,  perimeter,  ratio  of  area  to 
perimeter  squared,  elongation,  etc.). 

The  relationships  used  are  those  of  relative  locations 
of  the  different  objects  and  the  symbolic  relationships  of 
left,  right,  above  and  below.  Other  relationships  such  as 
symmetry  and  similarity  are  obviously  useful,  but  have  not 
been  implemented. 

Our  representation  and  use  of  knowledge  is  similar  to 
that  described  by  Tenenbaum  [3].  The  principal  difference 
is  in  Tenenbaum1 s use  of  single  pixel  attributes  to  uniquely 
distinguish  objects  (in  a given  context).  We  use  object 
attributes  to  aid  in  the  segmentation  of  the  image  and  then 
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use  the  attributes  of  larger,  segmented  parts  for 
recognition. 

Feature  Extraction  and  Segmentation 

Feature  extraction  and  segmentation  is  guided  by  the 
properties  of  the  desired  objects  to  be  extracted.  Thus,  an 
edge  detection-line  finding  process  is  applied  to  extract 
desired  linear  segments  (such  as  roads)  and  a region 
segmentor  for  extracting  areas  uniform  in  some  property  (for 
example  lakes  and  other  bodies  of  water). 

Consider  the  aerial  image  shown  in  figure  1 (the 
displayed  image  contains  352  x 352  pixels,  an  image  of  twice 
the  resolution  is  also  used  in  the  analysis).  Here,  an 
objective  may  be  to  locate  the  dock  structure  and  perhaps 
some  ships  in  it.  As  this  structure  consists  of  relatively 
small  parts  and  is  complex,  it  may  be  easier  to  extract 
related  structures  such  as  the  river,  the  major  highway  and 
the  lakes  first,  and  use  these  to  concentrate  the  search  for 
docks  to  a smaller  area  of  the  image.  (We  assume  such 
information  is  supplied  by  the  user.  No  attempt  has  been 
made  to  automate  the  strategy  generation  process,  as  in 
[4].) 


Edge  detection  processes  are  appropriate  for  the 
extraction  of  the  desired  roads.  Figure  2 shows  the  results 
of  applying  a Hueckel  edge  detector  [5]  on  the  image  of 
figure  1 and  linking  the  resulting  edge  segments  in 
elongated  segments  [6].  The  road  is  known  to  be  narrow 
enough  that  the  edges  corresponding  to  it  are  of  the  "line" 
type  (as  contrasted  with  a step  edge).  Restricting  the 
linked  edges  to  be  only  of  line  type  results  in  fewer 
segments  (shown  in  figure  2). 
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Figure  1.  An  aerial  picture. 


Figure  2.  Lines  detected 
in  figure  1. 


Figure  3 Dark  regions 

detected  in  figure  1 


The  lakes  and  parts  of  the  river  are  conveniently 
extracted  by  using  the  Ohlander-Pr ice  Segmentor  [ 7 J . It  is 
known  that  the  desired  objects  are  relatively  dark  and 
uniform  in  intensity,  and  the  dark  peak  in  the  intensity 
histogram  should  be  used  for  segmentation.  The  completed 
segmentation  is  shown  in  figure  3. 

Matching  of  Segments 

The  next  step  is  to  match  the  derived  line  segments  and 
regions  with  a model  of  the  image.  This  phase  of  our  work 
is  in  progress  and  experimental  results  are  expected  to  be 
available  soon.  Assuming  that  the  derived  segments  are 
distinctive  enough  to  be  easily  distinguished,  approximate 
location  of  the  dock  structures  can  be  predicted.  Now, 
sensitive  line  detectors  should  help  locate  the  piers  of  the 
dock.  (We  have  found  the  Hueckel  edge  detector  to  be 
deficient  in  locating  small  edges,  perhaps  because  of  the 
large  neighborhood  size  used.  Development  of  more  sensitive 
edge  and  line  detectors  is  being  carried  out  concurrently, 
see  (8].) 

Conclusions 


Some  results  of  processing  a complex,  aerial  image 
using  both  the  line  and  the  region  based  techniques  have 
been  shown.  It  appears  that  the  use  of  simple  techniques, 
specifically  suited  to  particular  objects  in  an  image,  may 
allow  useful  processing  of  rather  complex  images.  This  work 
is  in  initial  stages  of  development  and  the  array  of 
segmentation  attributes  is  limited.  While  it  is  hoped  that 
the  described  techniques  have  general  applicability,  our 
experience  with  real  images  is,  as  yet,  limited. 
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One  major  problem  with  the  several  subsystems  at  the 
Image  Processing  Institute  is  that  there  is  no  way  to  easily 
run  these  programs.  All  these  systems  require  program 
specific  inputs  which  are  meaningful  only  to  the  programmer. 

We  have  implemented  an  initial  system  which  is  intended  to 
partially  eliminate  this  problem.  This  system  asks 
questions  which  are  necessary  to  obtain  the  information 
needed  to  operate  the  program  so  that  it  acquires  the  proper 
information.  The  system  is  then  able  to  initiate  the 
execution  of  the  subsystem  and  give  the  input  that  has  been 
specified  by  the  user's  answers. 

This  program  is  not  developed  to  the  extent  of  systems 
such  as  MYCIN  for  medical  analysis  [1]  or  PROSPECTOR  for  . 

geological  analysis  (2].  This  program  is  intended  to 

provide  an  inexperienced  user  with  a guide  to  the 

information  that  is  required  by  various  programs,  not  deduce 
an  interpretation. 

For  a program,  or  the  set  of'  programs,  a user  must 
provide  a set  of  questions  that  may  be  asked.  Each  of  the 
questions  has  a set  of  valid  responses  with  actions  for  each 
response.  These  questions,  responses,  and  actions  are  given 
in  a file,  not  compiled,  thus  modification  of  the  individual 
questions  or  changing  the  entire  set  of  questions  is 
simplified.  The  actions  may  include  new  questions  which  are 
necessary  because  of  the  answer,  e.g.  if  the  question  is 
about  the  type  of  segments  which  are  desired,  and  the  user 


indicates  "lines"  then  the  questions  dealing  with  region 
based  segmentations  are  not  necessary  and  those  dealing  with 
line  extraction  must  be  asked.  Actions  also  include 
remembering  the  program  inputs  that  are  specified  by  the 
user's  answer.  Some  special  operations  such  as  running  the 
specific  program  are  compiled  rather  than  included  in  the 
question-response  file. 


The  future  goals  for  this,  or  a similar  system,  are  to 
learn  how  a user  would  solve  particular  problems,  how 
certain  types  of  objects  might  be  specified  and  to  allow 
interaction  with  the  several  distinct  subsystems  which  are 
used  in  the  image  understanding  task.  The  user  should  be 
allowed  to  modify  or  add  to  the  legal  questions,  responses, 
or  actions  so  that  the  user  can  request  the  segmentation  of 
a particular  object  after  describing  it  once  rather  than 
describing  the  method  in  detail  each  time.  Currently  a 
small  set  of  questions  are  available  for  segmentation  and 
matching  operations. 
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2.4  An  Initial  System  for  Sharing  Information  Between  Image 
Understanding  Programs 

Keith  Price 


This  is  a report  on  the  initial  efforts  to  allow  easier 
interaction  of  a user  between  several  different  programs 
working  on  the  same  problem.  The  system  allows  the  user  to 
interact  with  several  programs  operating  in  parallel  on  one 
or  more  problems  and  to  share  information  between  programs. 

The  TENEX  system  provides  many  of  the  features 
necessary  for  this  system.  A program  can  create  an  inferior 
fork,  start  another  program  running  in  this  fork,  and  also 
continue  running  itself.  The  program  in  the  inferior  fork 
does  not  require  any  modification  to  be  run  in  this  mode 
since  the  superior  program  controls  the  source  and 
destination  of  all  input  and  output  which  would  usually  go 
through  the  user  terminal.  The  current  system  allows  the 
user  to  send  terminal  input  to  a specific  fork  or  connect  to 
a particular  fork  so  that  all  terminal  input  go  directly  to 
that  fork. 

Sharing  of  a portion  of  the  address  space  to  programs 
and  inferior  forks  is  also  provided  by  the  TENEX  monitor.  A 
program  can  map  certain  pages,  a page  is  the  basic  TENEX 
unit  for  storage  allocation,  of  its  core  area  or  a file  into 
other  pages  in  the  program  address  space.  Additionally, 
pages  in  the  address  space  of  a program  in  a superior  fork 
can  be  mapped  into  the  address  space  of  an  inferior  fork  ot 
from  one  inferior  fork  to  another.  The  mapping  of  pages  in 
the  address  space  of  one  fork  to  the  address  space  of 
another  fork  means  that  references  to  addresses  within  a 
mapped  page,  by  the  program  in  either  fork,  are  references 
to  the  same  actual  location. 
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If  several  programs  are  going  to  share  data  or  other 
results  in  this  manner,  then  it  is  usually  necessary  to 
modify  these  programs,  since  the  shared  data  must  be  placed 
in  a particular  location  in  the  address  space.  Such  data 
can  thus  be  hard  to  access  symbolically,  for  example  as  a 
simple  variable,  but  SAIL  provides  a partial  solution  to 
this  problem,  that  is  the  access  of  this  global  data 
symbolically.  The  RECORD  construct  in  SAIL  has  provisions 
for  the  user  specification  of  the  allocation  method  for 
individual  records,  such  as  determining  the  address  of  these 
records.  Thus  some  record  structures  may  be  allocated  in  a 
global  area  and  referenced  symbolically  by  several  current 
processes.  The  program  in  a superior  fork  can  also  access 
any  address  in  its  inferior  forks.  This  feature  is  more 
useful  as  a de-bugging  tool  than  for  sharing  data. 

The  quest  ion/answer  program  described  elsewhere  in  this 
report  may  be  included  as  a sub-program  in  this  system. 
When  that  program  is  required  to  run  another  system,  it  uses 
the  multiple  process  facilities  built  into  this  system. 
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2.5  Singular  Value  Decomposition  Image  Feature  Extraction 
William  K.  Pratt 


The  singular  value  decomposition  (SVD)  is  a numerical 
technique  of  matrix  transformation  by  which  an  arbitrary 
matrix  can  be  expressed  in  outer  product  form.  SVD 
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expansions  have  been  applied  in  the  solutions  of 
ill-conditioned  sets  of  linear  equations  [1].  There  has 
also  been  recent  application  of  the  SVD  concept  to  image 
restoration  and  coding  [2].  Another  application,  explored 
here,  is  the  use  of  the  SVD  to  extract  descriptive  features 
from  an  image  region  for  purposes  of  classification  or 
analysis. 

Singular  Value  Matrix  Decomposition 


Consider  an  M x N matrix  F of  rank  R.  The  SVD 
transform  of  F is  defined  to  be 

A * - UT  F V (1) 

where  U and  V are  unitary  matrices  and  is  a diagonal 

matrix  whose  diagonal  elements  A^  (1 ) ± A^  (2)  2 •••  2 ^ (**) 
are  called  the  singular  values  of  F.  Since  U and  V are 
unitary  matrices,  the  inverse  transform  yields 

F - U A*  VT  (2) 


The  unitary  matrix  U contains  columns  u>  composed  of 

” T 

eigenvectors  of  the  symmetric  matrix  product  FF  . 
Similarly,  the  columns  vn  of  V are  eigenvectors  of  F F.  The 
defining  relations  are 

2T[  Z lT  ]0  - A (3.) 

YTC  rT  F ]»  - A (3b) 

The  matrix  decomposition  of  eq. (2)  can  also  be  expressed  in 
the  series  form 


z - ^ *%o>  aj  vj 


where  the  sum  is  over  the  rank  R of  F. 


(4) 
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SVD  Image  Feature  Extraction  Concept 


The  singular  values  of  a matrix  can  be  considered  as 
descriptors  or  features  of  the  matrix  elements  and  their 
inter-relationships.  If  the  matrix  is  composed  of  randomly 
chosen  real  numbers,  the  singular  values  will  tend  toward 
equality.  On  the  other  hand,  a highly  structured  matrix 
will  exhibit  a few  dominating  singular  values.  Figure  1 
sketches  the  qualitative  structural  relationship  of  the 
singular  values.  This  observation  forms  the  basis  for 
utilisation  of  the  SVD  as  a means  of  forming  image  features. 
An  SVD  expansion  is  formed  over  a block  of  N x N pixels,  and 
some  measure  of  the  skewness  of  the  singular  value 
distribution  is  formed  to  characterize  the  spatial 
"coherence"  of  the  pixel  values.  This  vague  concept  will  be 
solidified  shortly. 


Deterministic  Properties 


Certain  deterministic  image  patterns  possess  SVD 
expansions  that  can  be  easily  computed.  For  example,  an 
array  of  unit  value  pixels  can  be  generated  by  the  outer 
product  expansion 


1 1 . . . 1 

1 1 . . . 1 


11.. 


. 1 


(5) 


Such  a matrix  is  of  unit  rank,  and  therefore,  possesses  only 
one  non-zero  singular  value.  A matrix  containing 
alternating  vertical  stripes  of  zeroes  can  be  formed  by  the 
single  outer  product  expansion 
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_ 1_ 

.10  11.  . .10 

(6) 


Horizontally  striped  arrays  can  be  generated  by  reversing 
the  positions  of  the  vectors  in  eq.(6).  The  reason  that  the 
arrays  of  eqs. (5)  and  (6)  have  only  one  non-zero  singular 
value  is  that  their  rows  are  linearly  dependent,  in  fact 
identical.  Hence,  any  array  with  repeated  rows  (or  columns) 
will  possess  only  one  non-zero  singular  value.  This  class 
includes  striped  arrays  of  various  periods  and  arrays  with 
single  horizontal  or  vertical  line  strokes. 


A checkerboard  array  of  ones  and  zeroes  can  be 
generated  by  the  sum  of  two  outer  products 


'l 
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10  10.  . . .10 
0 10  1.  . . .01 


10  10  10 
0 1 0 1 ....  o 1 


(7) 


and  therefore,  possesses  two  non-zero  singular  values. 
Finally,  an  N x N identity  matrix  with  ones  down  its  main 
diagonal  and  zeroes  elsewhere  is  a rank  N matrix  and  has  N 
equal  singular  values. 


These  simple  deterministic  examples  illustrate  some 
interesting  points.  First,  the  SVD  expansion  is  invariant 
to  the  period  of  striped  horizontal  and  vertical  patterns. 
In  fact,  a constant  amplitude  array,  which  can  be  considered 
an  array  with  a 100%  duty  cycle  period,  has  the  same  number 
of  non-sero  singular  values  (one)  as  a striped  matrix  with 
single  pixel  stripes.  As  a consequence,  the  SVD  is  clearly 
not  useful  as  a measure  of  periodic  structure.  The  second 
major  point  of  the  examples  is  that  rotation  of  a pattern 
affects  the  number  of  singular  values.  A checkerboard 
pattern  consisting  of  diagonal  stripes  possesses  two 
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non-zero  singular  values,  while  horizontally  or  vertically 
striped  arrays  only  have  one  non-zero  singular  value.  As  an 
even  more  extreme  example  consider  the  N x N diagonal  matrix 
with  a single  diagonal  line.  It  has:  N identical  singular 
values.  But,  an  array  with  a single  pixel  vertical  line 
possesses  one  non-zero  singular  value. 

The  conclusion  of  these  observations  is  that  the 
singular  value  distribution  is  not  well  suited  for  the 
characterization  of  deterministic  line,  spot,  or  shape 
structure.  But  rather,  the  SVD  does  provide  an  indication 
of  the  structural  dependence  between  rows  and  columns  of  a 
pixel  array.  This  property,  as  will  be  shown,  is  extremely 
important  for  the  characterization  of  texture-like  regions 
of  an  image. 

Statistical  Properties 

If  the  image  block  F is  considered  to  be  a sample  of  a 
two-dimensional  random  process  then  the  singular  values  A 
as  defined  by  eq.(l)  will  be  random  variables  since  the  SVO 
is  a linear  transformation.  Attempts  are  underway  to 
determine  the  covariance  matrix  of  the  singular  values  in 
terms  of  the  covariance  function  of  the  image  block. 

Experimental  Results 

Figure  2 contains  two  images  of  natural  texture,  grass 
and  ivy,  along  with  two  artificially  generated  fields 
obtained  by  convolutional  processing  of  two-dimensional 
fields  of  randomly  generated  pixels.  The  subjective  match 
between  the  natural  and  artificial  fields  appears  to  be  in 
reasonable  agreement.  Figures  3 and  4 contain  plots  of  the 
singular  values  extracted  from  16  x 16  pixel  blocks  in  the 
center  of  each  image.  The  distributions  of  the  natural 
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(a)  natural  grass 


(b)  artificial  grass 


(c)  natural  ivy  (d)  artificial  ivy 


Figure  2.  Examples  of  Natural  and  Artificial  Texture. 
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SINGULAR  VALUE  AMPLITUDE,  ,/MjT 


Figure  3.  Singular  Values  of  Natural  and  Artificial  Grass 
Texture . 
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SINGULAR  VALUE  AMPLITUDE,  </X(j) 


between  the  actual  and  the  ideal  template  in  the  decision 
procedure.  Although  this  assumption  is  correct  for  the 
exact  Hueckel's  operator,  two  basic  approximations  have 
their  effect  on  actual  results.  The  first  is  the 
discretization  procedure  in  which  a continuous  picture  is 
replaced  by  discrete  intensities  at  sampling  points.  The 
second  is  the  use  of  a finite  number  of  Fourier  coefficients 
in  the  optimization  procedure.  In  this  paper  the  effect  of 
these  two  approximations  is  discussed.  In  addition,  some 
advantages  of  the  Hueckel  operator  are  explained. 

Effect  of  Continuous  to  Discrete  Mapping 

This  effect  is  pronounced  in  two  points.  The  first  is 
that  a circular  disk  cannot  be  represented  in  the  discrete 
domain.  The  second  point  is  that  the  orthogonal  bases  used 
in  the  optimization  procedure  are  derived  in  the  continuous 
domain,  and  orthogonality  is  not  exactly  valid  in  the 
discrete  domain.  Although  Hueckel  has  used  averaging 
methods  to  reduce  the  previous  effects,  it  is  not  clear  that 
his  procedure  can  be  better  than  another  operator  designed 
completely  in  the  discrete  domain. 

The  Effect  of  Using  Finite  Number  of  Coefficients 

Hueckel  has  used  only  nine  coefficients  in  the 
optimization  procedure.  This  choice  was  based  on  the  fact 
that  high  frequency  components  are  generally  the  result  of 
noise.  To  determine  the  effect  of  neglecting  high  frequency 
components,  simple  edges  and  lines  are  reconstructed  using 
the  first  nine  components  only.  Results  are  shown  in  figure 
1.  It  appears  that  the  resulting  images  differ  from  the 
original,  especially  in  the  case  of  thin  lines.  This 
affects  the  optimality  of  the  procedure.  In  addition  it  is 
not  clear,  in  the  case  of  finite  number  of  components,  that 
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grass  and  ivy  are  seen  to  be  significantly  different,  but 
the  distributions  of  the  natural  and  artificial  pairs  are 
quite  close. 

Summary 

The  results  presented  are  quite  preliminary,  but  very 
encouraging.  The  SVD  singular  value  distribution  does 
indeed  seem  to  be  a viable  means  of  characterizing  spatial 
structure  within  an  image  block.  Further  work  is  underway 
to  analyze  the  statistical  properties  of  the  singular  values 
and  to  develop  a quantitative  measure  of  the  singular  value 
distribution. 
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2.6  Some  Comments  on  the  Hueckel  Operator 
Ikram  E.  Abdou 


I ntroduction 

The  Hueckel  operator  is  one  of  the  classical  methods 
for  edge  detection  [1,2].  It  has  been  considered  as  an 
optimum  method  [3]  because  it  uses  the  mean  square  error 
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(a)  Edge  at  center 


(b)  Edge  at  7 


(c)  Line  width  - 3 (d)  Line  width 


Figure  1.  • Original 

+ Reconstruction 


equal  weighing  of  the  error  in  different  components  result 
in  optimum  response.  Other  methods  are  to  use  different 
weights,  or  even  to  base  the  decision  on  each  component 
separately. 

Advantages  of  Hueckel * s Operator 

It  has  been  shown  that  the  Hueckel  operator  is  not 
optimum,  in  the  sense  that  there  may  exist  other  operators 
which  have  better  mean  square  error  performance.  However 
Hueckel 's  operator  has  some  nice  features  which  can  be  used 
in  other  operators. 

The  first  is  that  Hueckel's  performance  is  less 
sensitive  to  parameter  variation.  Figure  2 shows  a typical 
relation  between  the  Hueckel  parameter  Diff  and  the  figure 
of  merit.  The  performance  is  almost  constant  over  a wide 
range  of  Diff.  This  feature  is  useful  in  practical 
applications  where  it  is  difficult  to  change  parameters  for 
each  image. 

The  same  feature  can  be  achieved  in  simple  masks  if  the 
decision  procedure  is  based  on  a nonlinear  function  of  the 
signal  and  noise.  Some  of  results  obtained  are  shown  in 
figure  3. 

The  second  advantage  of  Hueckel's  operator  is  the  use 
of  a large  mask.  This  reduces  the  effect  of  noise  in  case 
of  low  SNR.  However,  large  masks  reduce  the  resolution  of 
the  operator,  and  the  optimum  size  should  be  a compromise 
between  SNR  and  resolution. 
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Figure  2.  Variation  of  figure  of  merit 
with  edge  detector  parameters 
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Conclusion 


In  this  paper  some  of  the  features  of  Hueckel's 
operator  are  discussed.  It  appears  that  the  operator  is  not 
a standard  method  to  which  other  operators  should  be 
compared.  However  it  has  its  own  advantageous  properties 
which  can  be  extended  successfully  to  other  edge  detectors. 
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2.7  Development  of  Edge  Detectors  for  the  Extraction  of 
Linear  Segments 

Peter  Chuan 


Numerous  edge  detectors  have  been  suggested  in  the 
past.  However,  it  is  not  clear  as  to  which  technique  is 
most  useful  under  different  given  conditions.  Evaluation  of 
edge  detectors  on  real  pictures  is  difficult  and  subjective 
in  nature  because  a model  for  the  image  does  not  exist. 
Such  evaluation  cannot  be  separated  from  the  goals  of  the 
processing.  Here  we  describe  an  approach  to  the  detection 
of  edges  that  correspond  to  long,  linear  boundaries. 

One  often  used  edge  detector  is  due  to  Hueckel  (1]  and 
has  been  extensively  used  in  some  of  our  previous  work.  For 
the  purpose  of  detecting  long,  linear  edges,  its  performance 
has  been  poor  in  the  presence  of  noise.  Figures  la  and  lb 
show  a noisy  image  and  the  corresponding  Hueckel  edges, 
respectively.  Here,  the  Hueckel  operator  is  shown  to  be 
indiscriminately  sensitive  to  all  kinds  of  edges. 

This  calls  for  a particular  edge  detector  tailored  to 
be  selectively  sensitive  to  long,  linear  edges.  Some 
crudely  constructed  edge  detectors  were  used  in  the  past  [2] 
and  have  shown  encouraging  results.  Six  directional  edge 
masks  angularly  distributed  in  approximately  30  degrees  were 
convolved  with  the  picture  of  interest.  An  approximated  60 
degree  edge  mask  is  shown  in  figure  2.  The  maximum  among 
the  outputs  of  the  six  directional  edge  masks  is  taken  and 
thresholded  to  show  strong  edges  only.  One  such  result 
obtained  from  figure  la  is  shown  in  figure  3.  Because  the 
■asks  are  binary  (excluding  0)  masks,  gross  approximations 
occur  both  in  angular  orientation  and  in  spatial  linearity 
of  the  edge  line.  Moreover,  such  edge  masks  are  difficult 
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Figure  la.  Original  image 


Figure  lb.  Hueckel  edges 
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Figure  2.  A grossly  approximated  binary  (excluding 
0)  mask  with  an  approximately  straight 
edge  of  60  degrees. 


Figure  3.  Output  from  long,  linear  edge  detector 
of  the  kind  shown  In  figure  2. 
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to  construct  and  for  a finite  size  mask,  only  a few 
directional  edge  masks  may  be  possible. 


Instead  of  approximating  edge  masks  of  a desired  angle 

with  binary  numbers,  a method  is  described  for  the 

construction  of  these  directional  edge  masks.  The  ideal 

case  for  these  particular  long  edge  masks  is  to  have 

infinitely  fine  resolution  edge  masks  so  that  they  can  be 

considered  as  continuous  signals.  In  the  continuous  domain, 

masks  of  any  angle  and  shape  can  be  approximated  as  closely 

as  desired  as  shown  in  figure  4.  This  continuous  mask 

M ( x ,y ) can  be  convolved  with  the  continuous  picture  F (x,y) 
c c 

to  produce  strong  signals  where  the  edges  of  F (x,y)  are 

c 

long  and  continuous.  This  convolved  output  can  be  sampled 
and  quantized  to  produce  our  ideal  discretized  edge  signal 
Ecc(m,n).  For  clarity,  this  process  is  illustrated  in 
figure  5a. 

Since  we  only  have  the  sampled  and  quantized  version 
Fjfnwn)  of  the  continuous  picture  Fc(x,y)  to  work  with,  the 
ideal  situation  can  only  be  approximated.  The  continuous 
edge  mask  Mc(x,y)  is  therefore  sampled  and  quantized  and  the 
discrete  output  Md(m,n)  is  convolved  with  F^(m,n)  instead, 
to  produce  Edd(m,n).  Again  for  clarity,  this  process  is 
illustrated  in  figure  5b. 

The  price  paid  for  using  this  approximate  process 
versus  the  use  of  binary  edge  masks  (figure  2)  is  in  the 
greater  number  of  quantization  levels  needed  for  the  edge 
masks.  However,  the  advantage  gained  is  that  this  technique 
will  allow  the  generation  of  directional  edge  masks  of  any 
desired  size  and  orientation,  with  the  possible  application 
being  the  confirmation  of  the  presence  or  absence  of  an  edge 
with  a certain  direction  or  shape.  Moreover,  this  model 
also  opens  up  the  possibility  of  an  analytical  evaluation  of 
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Figure  5a.  Ideal  case  for  edge  detection. 
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Figure  5b.  Model  for  actual  edge  detection. 
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edge  masks.  Figure  6a  shows  a 60  degree  edge  mask  generated 
through  this  model.  This  directional  edge  mask  has  weights 
obtained  by  integrating  the  continuous  edge  masks  (shown  in 
figure  6b)  over  an  area  covered  by  each  square  pixel.  The 
continuous  edge  mask  is  two  pixels  wide  on  each  side. 
Integration  is  carried  out  over  each  square  pixel  because 
this  is  the  same  process  that  will  be  carried  out  if  the 
continuous  mask  were  physically  digitized  by  an  Optronics 
scanner.  Figure  7 shows  the  edge  of  an  airport  picture 
obtained  by  applying  masks  generated  by  this  model. 

One  consideration  in  edge  detection  problems  is  on 
deciding  the  appropriate  size  of  edge  mask  to  use.  Given  a 
picture  with  small  signal  to  noise  ratio,  larger  masks 
should  be  allowed  for  detecting  edges.  Depending  on  the 
Fourier  bandwidth  (e.g.  the  closeness  of  two  edges)  of  the 
information  in  the  picture,  the  maximum  mask  size  can  be 
determined  that  will  average  out  the  noise  but  retain  most 
of  the  signals.  For  example,  comparing  figures  8a  and  8b, 
the  mask  with  the  size  shown  in  broken  lines  will  work  fine 
in  picking  out  edges  in  figure  8a  but  will  obviously  give 
weaker  edges  in  between  the  two  circles  in  figure  8b.  Masks 
of  two  different  sizes  have  been  applied  on  a test  pattern 
with  signal  to  noise  ratio  (Signal  Energy/Variance  of  Noise) 
equal  to  2.7  and  on  the  airport  picture.  Results  are  shown 
in  figure  9. 

Square  masks  have  been  used  conventionally  for 
directional  edge  masks.  This  is  illustrated  in  figure  10. 
Convolutional  outputs  from  each  of  these  individual  masks 
were  then  compared  to  extract  the  maximal  value.  It  should 
be  realized  that  mask  1 and  mask  2 not  only  have  different 
edge  directions  but  as  a result  of  this,  they  also  have 
different  shapes.  Comparing  mask  1 with  mask  2 would  be 
equivalent  to  comparing  matched  filter  outputs  using  two 
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Figure  6a.  A] 


35% 


Approximated  60  edge  mask 
(9x9)  generated  through  the 
model . The  shaded  region 
contains  negative  weights. 

pixels 


circular 
boundary 
of  edge 
mask 


Figure  6b.  Continuous  edge  mask. 

Shaded  region  contains  0 
weight,  positive  region  is 
weighted  1,  negative  region 
la  weighted  -1. 
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Figure  7.  Edges  obtained  from  circular 
weighted  masks  obtained  from 
our  proposed  model. 


Figure  8.  The  square  boxes  show  the  same  edge  detecting  mask. 

This  mask  is  applied  on  both  figures  (a)  and  (b) . 

At  the  location  indicated  by  the  position  of  the 
mask,  the  edge  output  from  figure  (b)  will  be 
lower  than  tne  edge  output  from  figure  (a) . 
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(a)  Edges  from  5x5 
circular  weighted 
masks  on  test 
pattern  (SNR-0.36) 
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(b)  Edges  from  9x9 
circular  weighted 
masks  on  test 
pattern 


(d)  Edges  from  9x9 
circular  weighted 
masks  on  airport 
picture 


Figure  9.  Effect  of  using  masks  with  sizes  5x5  and  9x9. 
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Figure  10.  Sauare  masks  with  zero  entries  only  along  the 
edge  line. 
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Figure  11a.  Shaded  region 
is  weighted  0 


Figure  lib.  Shaded  region 
is  weighted 
negative 
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Figure  11c.  Square  mask 
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different  signal  detectors,  which  is  not  really  justifiable. 
With  small  mask  sizes,  these  effects  do  not  show  marked 
differences  but  for  larger  masks,  it  is  possible  that  these 
effects  will  emerge  significantly.  A reasonable  correction 
for  such  effects  is  to  construct  circularly  shaped  edge 
masks  and  to  generate  the  sampled  quantized  version  of  the 
continuous  mask.  An  example  of  this  circular  mask  is  shown 
in  figure  11. 

7x7  square  and  circularly  bounded  masks  (listed  in 
figure  13)  have  been  used  to  show  the  effect  of  the 
different  mask  shapes.  This  is  illustrated  in  figure  12. 
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2.8  Circle  Detection  in  Noisy  Images 
Kenneth  I.  Laws 


This  paper  presents  two  methods  of  identifying  circles 
within  a noisy  or  textured  image.  A variant  of  the  Hough 
transform  method  is  outlined,  then  a new  method  for  directed 
edge  elements  is  developed  and  demonstrated. 
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Figure  12a, 12c  Thraahoidad  output  from  applying  square  masks. 
12b, 12d  Thrssholdad  output  from  applying  circularly 
bounded  masks. 

The  edges  in  (c)  and  (d)  have  been  obtained  by 
applying  a thinning  procedure. 
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(a)  7x7  Square  edge  masks 


ANGLE 

OF  EDGE  DIRECTION 

90  0 

0 

41 

89 

0 

-89 

-41 

e 

41 

100 

100 

0 

-100 

-100 

-41 

89 

100 

100 

0 

-100 

-100 

-89 

100 

100 

100 

0 

-160 

-100 

-tee 

89 

100 

100 

0 

-100 

-100 

-89 

41 

100 

100 

0 

-100 

-100 

-41 

0 

41 

89 

0 

-89 

-41 

0 

ANGLE 

OF  EOGE  DIRECTION 

60  0 

0 

41 

69 

100 

89 

-25 

0 

41 

100 

180 

100 

32 

-100 

-41 

89 

100 

100 

92 

-78 

-100 

-89 

100 

100 

100 

0 

-100 

-100 

-100 

89 

ICO 

78 

-92 

-100 

-100 

-89 

41 

100 

-12 

-10© 

-100 

-IOC 

-41 

0 

25 

-89 

-100 

-89 

-41 

0 

ANGLE 

OF  EDGE  C 

'1RECT10N 

30  0 

0 

41 

89 

100 

89 

41 

e 

41 

100 

100 

100 

100 

100 

25 

89 

100 

100 

100 

78 

-32 

-89 

100 

100 

92 

8 

-92 

-100 

-108 

09 

32 

-78 

-100 

-100 

-100 

-89 

-25 

-100 

-100 

-100 

-100 

-100 

-41 

0 

-41 

-89 

-100 

-89 

-41 

8 

ANGLE 

OF  EOGE  DIRECTION 

0 0 

0 

41 

89 

100 

89 

41 

0 

41 

188 

180 

108 

188 

188 

41 

89 

188 

180 

100 

180 

tee 

89 

0 

0 

8 

0 

8 

8 

8 

-89 

-180 

-188 

-180 

-188 

-188 

-00 

-41 

-100 

-188 

-108 

-188 

-188 

-41 

0 

-41 

-89 

-188 

-89 

-41 

0 

ANGLE 

OF  EDGE  DIRECTION 

■38  0 

0 

41 

89 

188 

09 

41 

0 

25 

100 

188 

188 

188 

188 

41 

-89 

-32 

78 

108 

188 

188 

89 

-100 

-100 

-92 

8 

92 

188 

188 

-89 

-100 

-188 

-188 

-70 

32 

89 

-41 

-188 

-188 

-188 

-188 

-188 

-25 

0 

-41 

-89 

-108 

-09 

-41 

0 

ANGLE 

OF  EDGE  DIRECTION 

>68  0 

0 

-25 

89 

180 

89 

41 

8 

-41 

-188 

32 

188 

188 

188 

41 

-8? 

-100 

-78 

92 

188 

188 

89 

-180 

-188 

-188 

8 

188 

188 

188 

-89 

-188 

-180 

-92 

78 

188 

89 

-41 

-100 

-100 

-188 

-32 

188 

41 

0 

-41 

-89 

-188 

-89 

25 

0 

(b)  7x7  Circular  edge  masks 


Figure  13.  Edge  masks  used  in  figure  12. 
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A circle  is  characterized  by  its  uniform  boundary.  The 
form  of  a circle  may  be  perceived  by  humans  even  if  the 
interior  region  is  not  uniform.  It  seems  appropriate  to 
search  for  circles  with  edge  detection  techniques  rather 
than  region  growing  methods. 

A circle,  or  even  a circular  segment,  is  an  extended 
entity.  No  local  property  of  an  edge  element  is  sufficient 
to  determine  whether  it  is  part  of  a circle.  Curvature 
might  be  a sufficient  local  property,  but  it  is  not 
available  from  the  standard  edge  detectors.  It  is  the 
relationship  among  edge  points  which  constitutes  a circle. 

The  edge  elements  must  be  equidistant  from  some  center  and 
the  direction  of  each,  if  known,  must  be  perpendicular  to  a 
line  through  that  center. 

True  circles  are  also  characterized  by  local 
continuity.  If  the  images  of  circular  objects  were 
continuous  there  would  be  little  difficulty  in  tracing  the 
boundaries.  Several  processes  interfere,  however.  The 
object  itself  may  be  slightly  irregular,  and  the  image  may 
be  corrupted  by  noise.  Conversion  to  digital  image  form 
introduces  quantization  error.  Finally,  edge  detectors  can 
be  applied  to  the  image  at  only  a finite  number  of  points, 
and  may  be  confused  by  texture  in  the  image. 

Fitting  Circles  Through  Directed  Edge  Elements 

Most  edge  detectors  identify  the  direction  of  maximum 
gradient  at  each  edge  point.  Adding  90  degrees  to  this 
angle  gives  the  edge  direction,  d.  The  edge  element  (x,y,d) 
is  thus  a short  directed  line  segment  at  (x,y),  with  the 
image  known  to  be  dark  on  one  side  and  bright  on  the  other. 

If  the  image  contains  a bright  disk,  the  edge  elements  will 
circle  it  in  a counter-clockwise  (or  positive)  direction. 
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Circles  through  a given  (x,y,d)  have  centers  (a,b)  on 
the  line  a cos(d)  + b sin(d)  * p,  where 
p * x cos(d)  + y sin(d) . Figure  la  shows  this  locus  in  X-Y 
space.  This  parameterization  may  be  made  unique  either  by 
restricting  p to  be  positive  or  d to  be  between  0 and  180 
degrees. 

The  Hough  transform  method  [1,2]  can  be  used  to  locate 
image  circles.  Each  edge  element  generates  a line  in  A-B-R 
space.  The  locus  consists  of  points 
(x  + r sin(d) ,y  - r cos(d),r),  where  the  sign  of  r 
corresponds  to  the  circle  direction.  If  r is  restricted  to 
positive  values,  the  locus  also  includes  the  line 
(x  - r sin(d) ,y  + r cos(d),r).  If  edge  directions  are  not 
known  the  locus  of  an  edge  point  is  a cone  in  A-B-R  space. 

The  A-B-R  space  is  quantized  to  form  an  accumulator 
array.  For  each  edge  element  the  counters  along  the 
transform  locus  are  incremented.  Cells  with  high  counts 
then  correspond  to  image  circles. 

The  above  method  requires  the  search  of  a 
three-dimensional  transform  space  for  clusters.  The  space 
must  be  quantized  finely  enough  to  provide  one  cell  for  each 
possible  circle  in  the  image.  A more  direct  method 
utilizing  pairs  of  edge  elements  will  now  be  developed.  It 
avoids  quantization  errors  by  searching  for  clusters  in  a 
continuous  space. 

A circle  is  determined  by  two  of  its  directed  edge 
elements,  as  shown  in  figure  lb.  A space  with  N edge 
elements  contains  N(N-l)/2  possible  circles  — one  for  each 
pair  of  points.  A typical  picture  may  have  several  thousand 
edge  points,  or  millions  of  hypothesized  circles.  Simple 
screening  can  reduce  this  to  a manageable  number  of 
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possibilities. 


Preliminary  screening  is  aimed  at  reducing  the  number 
of  edge  elements.  It  may  be  possible  to  eliminate  some 
areas  of  the  image  or  to  consider  only  high  contrast  edges. 
Edge  elements  which  have  been  linked  into  line  segments  may 
sometimes  be  excluded  from  further  consideration. 

A distance  criterion  helps  quickly  screen  out  edge 
pairs.  The  distance  between  a pair  of  points  should  not 
exceed  Dmax,  the  largest  diameter  being  sought.  This 
parameter  is  critical  since  further  processing  time  is 
proportional  to  the  square  of  Dmax.  Larger  circles  can  be 
found  by  line  detection  techniques. 

An  angle  criterion  is  also  useful.  If  two  edge 
elements  are  nearly  parallel  it  is  difficult  to  find  the 
corresponding  circle  parameters.  The  pair  should  be  skipped 
if  the  absolute  value  of  either  d2  - d^  or  d2  - (d^  + 180 
degrees)  is  less  than  some  minimum  angle. 

The  circle  parameters  determined  by  two  points  are 
a - (p1sin(d2)  - p2sin(d1))  / DET 
b - (p2cos(d1)  - p1cos(d2))  / DET 


where 


Pi  - Xicos(di)  yiSin(di), 

DET  » cos(di) sin(d2)  - cos(d2) sinCd^) . 

The  radius  may  be  computed  separately  for  each  edge  point: 
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r[  » (a  - x^)2  + (b  - yt)2 . 

If  the  two  values  differ  greatly,  the  two  edge 
elements  cannot  be  from  the  same  circle.  This  fact  may  be 
used  to  screen  the  hypothesized  circle  centers  and 

considerably  reduce  the  number  which  must  be  processed 
further . 

The  direction  of  rotation  of  each  edge  element  is 
another  important  parameter.  The  angle 

d^  - arctan(yt  - b)  / (x^  - a) 

will  equal  +90  degrees  for  positive  circles  and  -90  degrees 
for  negative  circles.  Two  edge  points  do  not  belong  to  the 
same  circle  if  they  have  opposite  rotations  about  the  common 
center.  For  line  drawings  these  directions  are  not  defined. 
Even  if  defined,  it  may  be  desirable  to  ignore  them  since 
disks  do  not  always  appear  against  uniformly  darker  or 
lighter  backgrounds. 

The  identification  of  image  circles  has  now  been 
reduced  to  a standard  clustering  problem.  The  parameter 
points  corresponding  to  an  image  circle  should  form  globular 
clusters,  possibly  elongated  in  the  R dimension.  There  is  a 
relationship  between  the  radius  of  an  image  circle  and  the 
number  of  edge  pairs  it  produces,  but  this  serves  only  as  an 
upper  bound  on  cluster  membership  if  circular  segments  are 
also  being  sought. 

Many  algorithms  exist  for  finding  globular  clusters. 
The  chosen  method  must  be  insensitive  to  a large  number  of 
outliers  or  noise  points.  The  method  should  also  be  fast 
and  use  little  memory.  It  should  identify  the  number  of 
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clusters  a posteriori,  but  need  not  do  so  hierarchically. 
The  clusters  found  should  be  independent  of  the  order  in 
which  points  are  given,  although  this  is  not  critical. 

Experimental  Resul ts 

Figure  2a  shows  an  armored  personnel  carrier  against  a 
desert  background.  Figure  2b  shows  the  edge  elements 
identified  by  a Hueckel  operator  [31.  This  data  base  was 
taken  as  the  starting  point  for  the  circle  location  problem. 
The  image  scale  is  considered  to  be  256  pixels  in  each 
direction. 

The  edge  map  contains  1,916  edge  elements.  There  are 
thus  1,834,570  pairs  of  edge  points  to  be  screened.  Figure 
2c  shows  the  X-Y  positions  of  the  523  parameter  points  found 
by  screening  for  49  seconds  on  a PDP-10  KI.  Edge  pairs  more 
than  15  pixels  apart  or  deviating  less  than  60  degrees  from 
each  other  were  not  considered.  Radii  computed  for  the  two 
elements  had  to  be  within  10  percent  of  each  other.  The 
direction  of  rotation  had  to  be  the  same  for  each  element, 
and  was  used  to  form  the  sign  of  the  radius  value. 

Clustering  was  then  done  in  the  A-B-R  space  using  a 

Euclidean  distance  measure  and  a variant  of  Wishart's 

convergent  K-Means  algorithm  for  a variable  number  of 

clusters  [4].  A circle  with  a radius  of  r pixels  should 

2 

produce  a cluster  of  approximately  r parameter  points, 

depending  upon  the  screening  criteria.  Clusters  were  only 

kept  if  they  contained  more  than  three  points  and  more  than 
2 

0.25  * r . Five  clustering  iterations  were  used,  taking  15 
seconds  of  computing  time.  Figure  2d  is  a plot  of  the  ten 
circles  corresponding  to  cluster  centroids.  The  reference 
lines  in  the  figure  were  found  independently  by  the  method 
of  Nevatia  [51 . 
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(a)  Original  Image 


(b)  Edge  Data 
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(c)  Hypothesized  Circle  Centers  (d)  Accepted  Circles 


Figure  2.  Experimental  Results 


The  practical'  limit  of  15  pixels  for  Dmax  is  adequate, 
since  larger  circles  may  be  found  by  line  detection  and 
linking  techniques.  Better  screening  may  be  possible  using 
edge  strength  or  even  curvature  information.  Experiments 
indicate  that  removal  of  straight  segments  from  this  data 
base  greatly  reduces  the  circle  finding  time,  but  larger 
circles  are  missed.  Removal  of  edges  belonging  to  circles 
might  similarly  aid  line  finding  programs. 


Without  reference  back  to  the  edge  data  the  circles  can 
only  be  classified  as  complete  and  incomplete.  Locating  the 
corresponding  edge  elements  is  a simple  best-fit  search 
problem.  Each  set  of  points  along  a circle  must  be  sorted 
or  linked,  then  checked  for  continuity.  Some  circles,  such 
as  the  vehicle  wheels,  will  be  dense  and  complete.  Circles 
corresponding  to  rounded  corners  can  be  identified  as  short 
circular  segments.  Others,  such  as  the  bushes  in  the  upper 
left  corner  of  figure  2,  may  be  identifiable  as  noise 
circles  because  of  their  lack  of  continuity. 
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2.9  Analytic  Results  of  the  Coleman  Segmentor 
Harry  C.  Andrews 


Automatic  bottom  up  human  unassisted  image  segmentation 
has  been  developed  by  Coleman  [1]  for  the  Image 
Understanding  program.  The  system  utilizes  pattern 
recognition  techniques  in  N dimensional  vector  space  to 
perform  decorrelation,  clustering,  feature  rejection  and 
ultimate  segmentation.  The  only  underlying  assumption  for 
the  process  is  that  homogeneous  clusters  in  N space  are 
representative  of  homogeneous  regions  of  an  image  in 
perceptual  space.  The  system  is  designed  to  operate  with- 
any  set  of  computable  features  and  will  automatically  select 
the  best  subset  of  those  features  to  develop  tightly 
clustered  homogeneous  regions  in  N space  which  then  serve  to 
define  the  segmentation  of  the  original  image.  In  the 
interest  of  smart  sensor  implementation,  the  system  has  been 
designed  for  frame-to- frame  segmentation  for  real  time 
television-like  sensors. 

Segmentor  Configuration 

Figure  1 presents  a block  diagram  representative  of  the 
system  design  of  the  segmentor.  The  first  component  of  the 
system  is  the  "feature  computation"  phase.  This  process 
computes  the  features  that  the  designer  feels  will  be 


Figure  1.  Segmentor  Configuration 


relevant  for  effective  clustering.  Essentially,  features 
are  computed  up  to  as  high  a resolution  as  at  every  pixel  if 
desired.  Because  the  features  to  be  computed  are  defined  by 
the  user,  it  is  at  this  phase  that  human  intuitive  and 
design  processes  are  brought  to  bear  on  the  segmentation 
problems.  Once  the  human  defined  features  are  computed,  the 
system  then  becomes  automatic  for  subsequent  optimization. 
The  computed  features  (N  of  these)  then  define  an  N 
dimensional  coordinate  system  wherein  each  pixel  will 
subsequently  represent  a point  in  N space.  Typical  features 
that  might  be  computed  are  listed  in  table  1.  Brightness 
amplitudes  for  monochrome,  color,  and  multispectral  scenes 
are  obvious  candidate  features.  Texture  features  might  be 
delineated  by  edges  or  other  spatial  frequency  processors 
and  are  listed  in  the  table.  Finally,  nonlinear  spatial 
filtering  processes  might  also  be  useful  for  segmentation 
and  this  class  of  features  is  listed  as  well.  Obviously,  as 
humans  we  can  continue  to  generate  more  features  as  we 
become  more  familiar  with  our  processing  goals.  The  only 
point  to  be  made  here  is  that  the  feature  computation  box 
will  only  be  as  clever  as  its  designer.  Subsequent  to  this 
phase,  all  processes  become  automatic.  However,  note  how 
simple  it  is  to  generate  fairly  large  dimensional  vector 
spaces  at  the  front  end  of  the  system.  It  is  because  of 
man's  propensity  to  generate  so  much  data  that  subsequent 
optimization  and  feature  rejection  procedures  must  be 
developed  to  efficiently  and  economically  process  such  data 
for  ultimate  segmentation  purposes. 

Returning  to  figure  1 we  see  that  the  next  phase  of  the 
segmentor  configuration  is  a straightforward  vector  space 
rotation  (unitary  transformation)  defined  by  the 
eigenvectors  of  the  overall  covariance  matrix  between  all  N 
features  computed  over  the  entire  image.  The  objective  of 
this  phase  is  to  decorrelate  the  features  such  that 


INDEX 


FEATURE  DESCRIPTION 


FEATURE  CLASS 


monochrome  brightness 
' re3‘ color" Brightness 
green  color  brightness 
blue  color  brightness 


band  1 brightness 


band  6 brightness 


Sobel  magnitude  on 
Sobel  magnitude  on  X2 


Sobel  magnitude  on  x. 


monochromatic  amp 


color  amplitude 


multispectral  amplitude 


Sobel  phase  on  x^q 
mode  filter  on  x^ 

mode  filter  on  x^q 
dispersion  filter  on  x 

dispersion  filter  on  x 


nonlinearly  filtered  feature 

1 

10 


clustering  is  implemented  in  N dimensional  decorrelated 
space.  In  this  nay  good  features  can  be  selected 
individually  and  bad  features  rejected  individually  without 
concern  as  to  correlation  properties  with  other  features. 
This  will  allow  efficient  compaction  of  good  clustering 
features  into  a few  parameters  thereby  providing  a large 
dimensionality  reduction.  However  it  is  important  to 
realize  that  feature  reduction  does  not  occur  immediately 
following  the  rotation  process  but  only  subsequent  to 
clustering  analysis. 

This  brings  us  to  the  next  step  in  the  system  which  is 
a k-means  clustering  algorithm  in  N dimensional  rotated 
space.  This  algorithm  converges  to  a set  of  k-mean  points 
describing  the  best  assignment  of  pixel  features  to 
k-clusters  such  that  the  sum  of  within  cluster  distances  is 
the  smallest.  The  disadvantage  of  the  algorithm  is  that  it 
requires  knowledge  of  the  number  of  clusters,  k,  in  advance. 
Clearly  this  is  unknown  and  consequently  the  k-means 
clustering  routine  must  be  implemented  for  all  reasonable 
values  of  k (i.e.  k*l , . . . , 16) . Subsequent  blocks  in  the 
figure  are  designed  to  determine  the  best  number  of  cluster 
and  the  best  features  to  provide  the  tightest  cluster 
distributions. 

Once  the  k-means  cluster  algorithm  has  converged  to  the 
minimum  spread  of  points  in  N space,  a fidelity  measure,  @, 
is  computed  to  establish  the  tightness  of  the  points  within 
the  clusters  and  the  degree  of  spread  or  separateness  of  the 
clusters  one  from  another.  This  fidelity  measure  is  given 

by 

8<k)  - tr[Sw(k)]  tr[Sb(k)] 

where  (Sw(k))  Is  the  within  cluster  scatter  matrix  and 
(Sb(k))  is  the  between  cluster  scatter  matrix  [2],  It  can 


be  shown  that  6 is  everywhere  nonnegative,  has  at  least  one 
maximum,  and  achieves  that  maximum  where  the  ratio  of  the 
within  cluster  scatter  equals  the  between  cluster  scatter. 
Therefore  it  is  hypothesized  that  the  optimal  number  of 
clusters  (k)  occurs  at  6 equal  to  its  maximum.  Therefore 
these  values  of  B and  k are  used  to  control  the  output 
segmentor  and  the  feature  rejector. 

The  feature  rejector  provides  the  function  of  removing 
those  features  which  do  not  contribute  to  tight  homogeneous 
clusters.  Consequently,  this  process  borrows  from 
supervised  pattern  recognition  theory  in  which  feature 
selection/rejection  is  often  implemented  through  the  use  of 
the  Bhattacharyya  distance  function  [3],  This  function 
provides  a measure  of  the  usefulness  of  a particular 
dimension  or  feature  by  investigating  that  feature's  ability 
to  separate  the  data  points  into  the  proper  clusters 
determined  by  the  k-means  convergence  algorithm.  This 
measure  is  provided  by  mean  and  variance  parameters 
determined  by  each  dimension  for  all  the  clusters.  Those 
features  or  dimensions  which  do  not  provide  well-defined 
clusters  (due  to  separate  means  and  tight  variances)  are 
rejected,  thereby  leaving  good  features  for  more  tightly 
homogeneous  clusters. 

Experimental  Results 

A variety  of  images  have  been  segmented  using  the  above 
clustering  algorithm  with  varying  degrees  of  perceptual 
success.  Figures  2 and  3 present  these  results  in  pictorial 
form.  Figure  2a  and  3d  were  original  monochrome  images 
while  figure  2d  was  a color  image  and  figure  3a  was  a ten 
band  multispectral  image.  Various  clustering  results  are 
presented  for  each  image  for  viewer  inspection.  The  last 
sequence  in  figure  3 represents  clustering  on  frame- to- frame 


a)  Original 


b)  2 Clusters 


e)  2 Clusters 


Figure  2.  Pictorial  Clustering  Results 


e)  4 Clusters  (frame  1) 


f)  4 Clusters  (frame  5) 


imagery  to  illustrate  the  potential  for  real  time  hardware 
smart  sensor  implementation. 

Probably  a more  relevant  representation  of  the 
segmentor  in  operation  is  to  view  the  Bhattacharyya  measures 
and  clustering  fidelity  factors  all  as  a function  of  k,  the 
number  of  clusters  for  each  iteration  of  the  k-means 
clustering  algorithm.  These  results  are  presented  in 
figures  4 and  5.  In  figure  4 two  plots  are  presented 
illustrating  the  performance  of  the  Bhattacharyya  feature 
rejector.  In  figure  4a  the  Bhattacharyya  distance  values 
are  plotted  for  each  dimension  or  feature  in  the  correlated 
space  for  the  variables  { x^ , X2 , . . . ,x^}  from  figure  1.  In 
figure  4b  the  Bhattacharyya  distances  are  plotted  for  each 
rotated  dimension  or  feature  in  the  decorrelated  space  for 
the  variable  (y^  ,^2  • • • °f  figure  1.  It  is  immediately 
obvious  that  by  decorrelating  (rotating  the  space)  one 
outstanding  feature  results  which  hopefully  will  allow 
effective  clustering  in  a vastly  reduced  vector  space  (see 
figure  2b).  In  addition  it  is  obvious  that  the  good 
features  (large  Bhattacharyya  values),  tend  to  be  good  for 
all  cluster  numbers  indicating  a degree  of  consistency  which 
allows  feature  rejection  of  those  dimensions  with  small 
Bhattacharyya  measure  with  some  degree  of  confidence. 

Figure  5 indicates  how  the  cluster  fidelity  parameter, 
B,  behaves  as  the  number  of  clusters  increases. 
Specifically,  figure  5a  indicates  that  for  the  monochrome 
APC  image,  without  feature  rejection,  the  peak  of  B is  quite 
poorly  defined  because  of  the  presence  of  a lot  of  useless 
features  essentially  adding  noise  to  the  well-defined 
clusters.  However  for  the  case  of  the  four  best  features  or 
the  single  best  feature,  a much  more  marked  peak  results  at 
a lower  cluster  number.  A similar  effect  occurs  for  the 
colored  house  of  figure  5b.  However  from  the  curves  of  all 
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features  compared  to  the  best  few  features,  not  as  dramatic 
a change  occurs.  This  is  because  the  use  of  color  features 
provides  a considerable  improvement  in  the  segmentation 
power  of  the  system  compared  to  having  only  monochrome 
features.  This  result  correlates  well  with  our  intuitive 
experiences  in  which  color  and  multispectral  signatures 
provide  quite  useful  aids  for  human  visual  segmentation 
procedures. 

Conclusion 

The  above  description  covers  the  highlights  of  the 
segmentor  developed  by  Coleman.  The  interested  reader  is 
referred  to  reference  [1]  for  details  of  the  system.  The 
algorithm  represents  a bottom  up  attempt  at  automatically 
segmenting  imagery  without  the  aid  of  human  intervention. 
It  allows  any  conceivable  set  of  features  to  be  used  for 
clustering  but  reserves  the  right  to  feature  reject  those 
parameters  which  do  not  contribute  to  well-defined,  tight 
clusters.  The  technique  is  based  upon  the  principles  of 
mathematical  clustering  algorithms  in  N dimensional  vector- 
space.  The  underlying  hypothesis  for  success  of  the 
technique  is  based  upon  the  premise  that  tight, 
well-defined,  homogeneous  clusters  in  vector  space 
correspond  to  well-defined  homogeneous  regions  in  an  image. 
If  this  premise  is  true,  successful  (i.e.  consistent  with 
human  perception)  segmentation  results.  If  unsuccessful 
segmentation  results,  then  improper  features  are  provided  in 
the  feature  computation  phase  which,  through  the  linear 
operations  of  decorrelation  and  feature  rejections,  do  not 
provide  proper  region  segments.  It  is  then  conjectured  that 
nonlinear  transformations  (or  other  features)  are  necessary. 
Finally,  the  segmentor  has  been  designed  with  smart  sensor 
real  time  implementation  in  mind.  The  hardware  construction 
of  such  a system  is  under  contemplation. 
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3.  Image  Processi ng  Projects 


This  section  of  the  report  describes  the  efforts 
expended  and  results  obtained  over  the  past  six  months  on 
the  various  image  processing  projects  carried  out  at  the 
Institute.  Some  of  these  projects  have  been  funded  by  other 
sources  is  indicated  in  the  appropriate  title 
acknowledgements.  Results  of  work  in  defining  the  degrees 
of  freedom  inherent  in  radar  imaging  systems  are  presented 
for  the  stripping  mode  of  SAR.  It  is  expected  that  future 
radar  imaging  model  degrees  of  freedom  analysis  will  lead  to 
efficient  radar  image  understanding,  a task  which  for 
humans,  is  far  more  difficult  than  visual  image 
understanding.  This  section  also  summarizes  the 
psychovisual  modelling  work  being  applied  to  image  coding 
and  image  rate  distortion  functions.  Results  in  blind  a 
posteriori  image  restoration  are  next  presented.  Finally 
two  new  optical  processing  procedures  are  described  in  which 
on-axis  holographic  optical  filtering  is  developed,  and 
optical  pseudocoloring  of  texture  information  is  described. 


3.1  Synthetic  Aperture  Radar  and  Imaging  System  of  the 
Stripping  Mode 

Chung-Ching  Chen 


Introduction 

The  concept  of  radar  is  relatively  simple  although  its 
implementation  often  is  not.  It  is  an  active  device  which 
operates  by  radiating  electromagnetic  waves  and  estimating 
the  characteristics  of  the  target  by  the  echoes  returned 
from  the  objects.  Since  its  appearance  in  World  War  II, 
radar  has  played  a very  important  role  in  both  military  and 
civilian  applications,  such  as  the  target  detection, 
navigation  of  the  ships  and  aircraft,  etc  [1].  The  purposes 
of  the  radar  can  be  dichotomized  as  target  detection  and 
parameter  estimation.  Detection  of  a target  is  the 
determination  of  its  presence  in  the  unavoidable  noisy 
situation,  and  parameter  estimation  is  the  measuring  of 
characteristics  of  the  targets,  e.g.,  their  ranges,  relative 
velocities,  angular  directions,  sizes,  etc,  by  the 
extraction  of  available  information  from  the  received  echoes 
[1J. 


The  power  of  an  airborne  or  spaceborne  ground  mapping 
radar  is  limited  by  its  resolving  abilities  in  both  azimuth 
and  range  directions,  whereby  azimuth  is  "along  the  flight 
track"  and  range  is  that  perpendicular  to  it  on  the  ground. 
Range  resolution  can  be  achieved  by  the  radiation  of  a short 
pulse  and  the  accurate  measurements  of  the  time  of  its 
return.  Alternatively,  the  pulse-compression  techniques  can 
be  applied  to  obtain  similar  resolution  with  greatly 
Increased  signal  power  (2).  Here  a suitable  modulation, 
usually  a linear  F.M.  or  chirp  signal,  because  of  its  high 
efficiency  in  terms  of  time- bandwidth  product,  easy  analysis 
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and  implementation,  is  imposed  upon  a CW  of  moderate  time 
duration.  In  this  technique  the  large  bandwidth  which  is 
necessary  for  accurate  range  resolution  is  created  while  a 
large  signal  power  is  also  achieved  because  of  the 
"stretching"  of  the  signal  duration  compared  to  short  pulse 
modulation,  which  has  a large  bandwidth  but  short  time 
duration.  Upon  receiving  the  echoes,  the  data  processor 
"compresses"  the  large  time-bandwidth  product  pulse  to  reach 
virtually  the  same  range  resolution.  It  can  be  shown  that 
the  signal  is  essentially  the  point  spread  function  (PSF)  of 
the  radar  imaging  system  in  the  range  dimension.  Although 
the  signal  is  long  in  time  and  thus  the  PSF  is  wide  in  range 
space,  the  range  resolution  is  not  degraded  because  of  the 
high-bandwidth  property  of  the  signal.  Hence  the 
simultaneous  achievements  of  high  resolution  and  large 
signal  power  are  possible.  It  is  pointed  out,  however,  that 
because  the  pulse  is  in  coded  form,  a decoding  scheme  which 
in  this  case  is  pulse  compression  which  is  coherent  in 
principle,  has  to  be  adopted,  as  contrasted  to  noncoherent 
processing  for  traditional  range  resolution  schemes.  The 
coherent  processing  provides  the  possibility  of  another 
point  of  view  on  the  imaging  system  - the  optical  or 
hologram  processing. 

From  antenna  theory  it  is  well  known  that  the  half 
power  beam  width  8 in  radians  of  a physical  antenna  of 
length  L is  (see  figure  1) 

8 - A/L  (1) 

where  A is  the  wavelength  of  the  radiation.  The  groundwidth 
projected  by  the  half  power  beam  is  then  where  R 
is  the  range. 
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Traditional  ground  mapping  radar  achieves  azimuth 
resolution  by  using  antenna  with  beam  narrow  in  azimuth. 
Physically  speaking,  the  narrower  the  beam  width,  the 
narrower  the  point  spread  function  and  thus  the  better 
resolution  obtainable.  This  is  the  same  situation  as 
discussed  in  the  case  of  uncoded  short  pulses  used  to  obtain 
range  information.  It  is  again  noted  that  here  the 
illumination  imaging  is  noncoherent  among  the  patches 
illuminated  at  different  antenna  locations.  Thus  the  wider 
beam  width  means  more  blurring,  hence  fine  azimuth 

resolution  demands  a very  long  physical  antenna  (L  large  in 
RX 

-£-) , most  of  the  time  not  practically  available. 

A comparison  between  the  imaging  schemes  along  range 
and  azimuth  directions  suggests  that  high  azimuth  resolution 
is  possible  by  the  use  of  some  forms  of  coding.  However, 
since  the  ground  is  two  dimensional  (range  and  azimuth) 
while  the  signal  is  only  one  dimensional  (a  function  of  time 
only)  a direct  simultaneous  compression  upon  the  signal 
itself  is  not  conceivable,  if  not  impossible.  One  solution 
is  to  azimuth-modula te  the  returned  echo,  instead  of 
modulating  the  signal  before  radiation.  This  is  made 
possible  by  the  constantly  changing  relative  location 
between  the  ground  points  and  the  radar.  The  technique  thus 
developed  is  called  synthetic  aperture  radar  (SAR) . 

The  azimuth  resolution  abilities  can  be  evaluated  from 
another  point  of  view.  The  aircraft  during  its  flight 
radiates  pulses  at  different  locations  and  receives  their 
echoes  shortly  after.  Because  of  the  azimuth  modulation 
incurred  by  the  relative  motion  of  the  craft  and  ground, 
coherent  processing  upon  the  received  data  to  obtain  maximum 
possible  resolution  is  required  as  explained  earlier.  This 
is  analogous  to  the  case  of  an  antenna  array  where  the 
received  signals  at  each  array  element  are  coherently 
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processed  and  summed  [3]. 


From  figure  2 it  is  easily  seen  that  the  width  of  the 
azimuth  beam  at  range  R gives  the  maximum  value  for  the 
length  of  synthetic  aperture  that  can  be  used  at  that  range. 
Hence 

. _p  0_  R A 

Leff  RB  TT  (2) 

is  the  effective  length  of  the  synthetic  antenna  aperture 

where  L is  the  azimuth  size  of  the  physical  antenna. 

However,  since  the  operation  of  SAR  utilizes  the  two-way 

beam  pattern  in  the  sense  that  phase  shift  is  introduced  on 

both  the  paths  and  from  the  target,  this  round-trip  phase 

shift  effectively  reduces  the  wavelength  by  a factor  of  2. 

Thus 


»eff  -2 »> 

The  azimuth  resolution  6„  is  the  effective  beam  width 

a 

projected  on  the  ground  at  range  R 


- 8effR 


2T 


XR 


eff 


X R 

rtor 


L 

7 


(4) 


which  is  independent  of  * and  R,  and  is  proportional  to  the 
azimuth  size  of  the  physical  antenna.  Thus  to  achieve 
higher  azimuth  resolution,  a shorter  antenna  is  to  be  used, 
in  contrast  to  the  single  antenna  case  in  eq.(l). 


Basically  there  are  three  modes  of  ground  mappings  of 
the  SAR  [4,5]:  spot  mapping,  strip  mapping,  and  Doppler 
beam- sharpening . Strip  mapping  is  the  most  general  one  and 
has  received  the  most  attention.  Some  of  its  principles 
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location  of  aircraft 


Fig.  2.  Analogy  between  antenna  array  and  synthetic  antenna 
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have  been  given  in  the  preceding  discussion.  In  the 
following  sections  we  will  formulate  the  strip  mapping  mode 
from  a system  point  of  view. 

Strip  Mode  of  SAR 

I ntroduct ion 


The  strip  mode  is  the  most  general  one  in  the 
operations  of  SAR's.  In  the  operation  an  airplane  or 
spacecraft  flies  over  the  ground  of  interest,  radiates 
pulses  at  different  locations  and  records  the  returned 
echoes.  As  shown  in  figure  3,  let  (x,y,z)  be  a rectangular 
coordinate  system  with  (x,y)  being  the  ground,  and  assume 
unless  otherwise  stated  that  the  flight  is  along  the  x axis, 
with  y-0  and  at  constant  velocity.  For  simplicity  a 
side-looking  radar  is  also  assumed,  although  it  could  as 
well  be  squint  [5] . The  situation  is  depicted  in  figure  3, 
where  we  use  (x^*y^»z^)  and  (X2,y2*z2^  t0  denote  the 
coordinates  of  an  arbitrary  target  point  and  the  aircraft, 
respectively.  We  will  assume  that  effect  of  the  height  of 
mountains  and  structures  on  the  ground  are  negligible  and 
that  the  curvature  of  the  earth  surface  can  be  ignored  such 
that  zj*0  for  all  the  ground  points  to  be  mapped. 
Furthermore  we  assume  that  the  time  origin  coincides  with 
the  x origin  of  the  aircraft.  Thus  X2*vt  where  v is  the 
constant  velocity  of  the  aircraft.  As  is  generally  the 
case,  the  antenna  is  assumed  to  be  shared  by  the  transmitter 
and  the  receiver.  This  necessitates  the  pulsed  nature  of 
the  signal  waveform  and  inevitably  creates  blind  ranges  [1]. 
Echoes  from  targets  in  blind  ranges  reach  the  radar  while  it 
is  transmitting  and  not  receiving  and  thus  are  lost.  The 
pulse  repetition  frequency  (PRF)  also  sets  an  upper  bound  to 
the  maximum  range  without  range  ambiguities.  For  simplicity 
of  analysis  we  assume  that  the  signal  pulse  train  consists 
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Fig.  3,  Flight-1 
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of  pulses  of  identical  waveform  with  equal  time  interval 

between  adjacent  ones.  Thus  let  f ( t ) be  the  modulation 

8 

function  of  a single  pulse  centered  at  t“0  then  the  pulse 
train  wavefunction  is 


f(t)  = fs(t-nTs)exp(ju>ct) 

n - -<»  ^ ^ 

where  T is  the  pulsing  period  and  id  the  angular  frequency 
of  the  carrier.  As  described  in  figure  4,  if  the 
"effective"  time  width  of  f8(t)  is  Tp  then  the  length  of  the 
time  during  which  the  transmitter  is  not  in  use  between 
consecutive  pulses  is  Tg-Tp  which  decides  the  maximum  range 
deviation  without  range  ambiguities. 

In  addition  to  those  factors,  the  depression  angle  ^ , 
which  is  the  angle  between  the  horizontal  plane  and  the 
radiated  beam,  and  the  antenna  pattern  also  affect  the 
performance  of  the  system;  the  geometry  is  depicted  in 
figures  5 and  6.  All  those  parameters  and  their 
interdependence  come  into  the  picture  of  the  SAR  imaging 
system  to  make  it  extremely  complicated  to  evaluate  its 
capabilities  and  estimate  its  performance  accurately. 
However,  if  we  model  the  point  spread  function  (PSF)  of  the 
system  in  some  desirable  way  by  appropriate  geometrical 
considerations  and  approximations,  we  will  be  able  to 
simplify  the  description  of  the  system,  making  the 
evaluation  relatively  easier  and  the  reconstruction  more 
feasible.  Of  course,  by  so  doing  we  also  distort  the  system 
by  an  inexact  modelling,  thus  an  incomplete  or  nonoptimal 
(in  some  sense)  reconstruction  is  to  be  expected.  It  is 
obvious  that  the  more  approximations  made,  the  more 
degradation  will  result  in  the  image  reconstruction.  In 
this  section  we  will  give  a hierarchy  of  the  system  models 
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Fig.  4.  Waveforms  of  Signal* 
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Fig.  5.  Flight-path  geometry  - (y,  z)  plane 
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Fig.  6.  Flight-path  geometry  - elant  plane 
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in  progressive  inaccuracies.  We  will  also  tabulate  the 
approximations  made  and  their  justifications.  We  will  find 
that  in  its  simplest  form  the  system  is  separable 
spa  ce-  i nva  r ia  nt . 

Method  of  system  evaluation 

In  general,  gathering  more  data  provides  more 
information  to  solve  for  the  unknowns  at  the  expense  of 
increased  requirements  in  storage  and  complexity  in 
computation.  On  the  other  hand,  intuition  suggests  that 
after  some  "threshold  amount"  of  data  is  obtained,  the 
additional  observations  do  not  always  provide  equal  amounts 
of  new  information.  This  is  due  to  the  inherent  "blurring" 
of  the  imaging  systems  and  unavoidable  observation  noise, 
etc.  Thus  the  concept  of  degrees  of  freedom  (DOF)  has 
arisen  to  measure  the  number  of  truly  independent  samples  of 
data  one  gathers  under  a particular  imaging  system  [6,7]. 

In  our  system  evaluation  we  will  adopt  the  concept  of 
eigenvalues  of  a correlation  matrix  or  the  Gramian  matrix 
[8,9].  For  example,  in  the  continuous-discrete  case,  we 
equate  the  number  of  degrees  of  freedom  with  the  number  of 
the  eigenvalues  of  the  Gramian  matrix  whose  magnitudes  are 
larger  than  some  threshold  determined  by  the  noise  level  of 
the  system.  This  is  equivalent  to  the  singular  value 
analysis  of  the  system.  Except  for  possible  permutation  the 
singular  value  spectrum  remains  invariant  regardless  of 
whether  the  object  and/or  the  image  have  gone  through 
orthogonal  transformation  before  and  after  the  imaging 
system,  respectively.  To  show  this,  consider  the 
discrete-discrete  case  for  the  sake  of  ease  in  proof:  Let  H 
be  the  matrix  of  the  linear  system  and  P,Q  be  orthogonal 
matrices  compatible  with  the  dimensions  of  the  output  and 
input  vector  sizes,  respectively.  Then  PtP»PPt»I  and 
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QtQ“QQt"I  where  the  superscript  t denotes  a transposition. 
The  set  of  eigenvalues  of  PHQ(PHQ) t-PHQQtHtPt-PHHtPt  is  the 
same  set  of  eigenvalues  of  PtPHHt*HHt  except  for  additional 
zeroes  due  to  possible  size  difference  of  P and  Q [10]. 
Note  that  the  sets  of  eigenvalues  of  PHQ  and  H differ  in 
general . 


Derivation  of  point  spread  function 

Referring  to  figure  3,  Zj*0,  x2*vt,  y2“0  and  Z2  is 
flight  height.  Define  ground  range 

Rg 4 [(xi‘x2)2+<yi-y2)2]% 

- [(x1-vt)2+y^]%  (6) 

and  slant  range 

R - [(x1-x2)2+(y1-y2)2+(z1-z2)2]^ 

- [(x1-vt)2+y2+z2J% 


The  propagation  delay  associated  with  a point  source  at 

2R 

<*ryr*l>  with  range  R defined  above  is  -jr  where  the  factor 
2 is  because  of  round  trip  to  and  from  the  target.  Let 
P(x1,y1)  be  the  reflectivity  function  of  the  terrain  and 
M*^*y^»x2,z2)  be  the  illuminating  pattern  of  the  antenna 
beam  on  the  terrain.  If  the  antenna  pattern  remains  the 
same  during  the  flight,  it  is  easily  seen  that 
R (x i# y i# x 2> z 2) "■ R ( x j^*x 2 , y ^ , z 2)  . The  received  echoes  as  a 
function  of  t are  the  product  of  illuminating  pattern  A, 
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terrain  reflectivity  P and  the  delayed  signal  function  f, 
summed  over  the  ground  coordinates  (x^,y^): 

z(t)  = | | A^-x^y^z^p^.y^f (t-^r)dxidyi  (8) 

— OD 

Substituting  eq. (5)  into  eq. (8) , 


z(c) 


j |A(x1-x2,y1(z2)p(x1,y1)  £ fs(t-^-nTs) 

-cd  n * -«> 

expjja^t-^JdXjdyj^ 


(9) 


If  we  interpret  eq. (9)  as  a system  with  P(x^,y^)  as 
input  and  z(t)  as  output  of  the  system,  it  is  obvious  that 
the  system  is  linear  with  point  spread  function 


where  nvTg  is  the  x coordinate  of  the  aircraft  at  which  n-th 
pulse  is  being  radiated.  It  is  assumed  that  during  the 
transmission  and  receiving  of  a single  pulse  the  aircraft  is 
approximately  stationary  so  that  x2  is  substituted  by  nvTs 
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in  eq.(10).  This  is  valid  if  (sufficient  conditions) 

a)  A(:<1-nvTa,y1,z2)  = ACx^nvTg  + Ax^y^Zj) 

b)  (|:c1-nvTs|  + Ax2) 2- (x1-nvTg) 2 <<  (y1+Ay1)2-y^  and 

°)  -^lxl"nvTg  |+Ax2)2+y2+z2J^-2  [(x^nvTg)  2+y2+z2]^  <<  Ac 

where  Ax2  is  the  maximum  distance  the  aircraft  travelled 
during  transmission  and  receiving  of  a single  pulse.  Ay^  is 
the  range  resolution  desired.  See  figure  7.  Note  that 
Ax 0 < vT  as  assumed  earlier. 

I s n \ 

(a)  is  easily  satisfied  by  noting  that  is  of  the 

order  of  hundred  or  thousand  meters,  and  is  therefore 
greatly  larger  than  Ax2,  which  is,  at  most,  of  the 
same  order  of  azimuth  resolution  desired. 

(b) 

(|x1-nvT8|+Ax2)2  - (x^-nvTs) 2 

o 

- 2Ax2 |x^-nvTg |+Ax2 

< 2Ax2(Lef£+Ax2)  ~ 2Ax2Leff 

while 

(yi+Ayi)2-yi 

- 2Ayt.y  + Ay2 
s2Ay1.y 

Since  ax2  is  of  the  same  order  as  Ay^  (potentially  best 
azimuth  resolution  vs.  range  resolution),  (b)  will  be  valid 
if 

Leff  <<c  y 1 (11) 
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Safest* 


(c)  :{[(|xrnvTj+Ax2)2+y2+z2]*  - [(x^nvT^  2+y2+z2]%  } 

,f,  2.  2,%.,dxrnvTsl+Ax2)2  /„21_2,%  (xl-nvTs) 2 ) 

= - (xi+z2)  — , 2:21 (yi+z2)  " } 

l (y1+z2)  (yi+z2>  J 

= ~2  3 i-  (|x,-nvT  | .Ax2+Ax?) 


- I Xj^-nvTg  | .Ax2 

(y^>* 


2LeffAx2 

< fj ir-T AXn 

- (y^)4  J 


However,  if  ax2  <<  Tg  <*  azimuth  resolution  * range 
resolution,  then  * y ^ will  also  satisfy  (b) . Note  that 

AX2  is  approximately  proportional  to  the  maximum  range  of  yj 
illuminated  by  the  antenna  where  Lgf£  <<  y by  the  validi 
of  (b) . Thus  if  we  let  Ax2  be  of  the  same  order  as  Xfi,  (c, 
will  be  satisfied.  In  fact,  the  achievable  azimuth 
resolution  is  of  the  same  order  as  Xc  (13]. 


Because  the  sinusoidal  phase  term  exp{jwct}  in  eq.  (10) 
does  not  carry  any  information  on  p(x^,y^) , it  can  be 
shifted  to  any  lower  frequency  u>0  desired.  In  optical 
processing  upon  SAR  data,  the  "offset”  frequency  u>0fH)  is  to 
separate  reconstructed  twin  images  from,  each  other  and  from 
other  useless  images  (11,121. 

Thus, 
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h(t;x1,y1)  = exp{  j(DQt}£A(x^-nvT8  ,y-p  Zj) 


(t  nT,  2 [(«rnvT«>2^t4*i]^ 


(12) 


Although  the  return  of  the  pulse  train  from  the  two 

dimensional  target  field  is  one  dimensional  - i.e.,  function 

of  t only,  the  recording  of  data  could  be  two  dimensional. 

In  fact,  because  of  its  huge  capability  for  data  storage, 

film  has  been  used  mostly  for  data  recording.  Recalling 

that  the  signal  returns  from  different  pulses  do  not 

overlap,  we  could  arrange  them  in  a more  compact 

two-dimensional  format:  Let  be  an  axis  perpendicular  to 

the  t axis.  If  we  had  moved  the  n-th  pulses'  return,  which 

occurs  between  time  nT  and  (n+l)T  , left  nT  units  towards 

s s s 

t»0  and  also  move  nvT  units  along  new  axis  x.,  we  would 

s 2 

have  had  the  following  configuration  of  the  data  shown  in 
figure  8. 

Note  that  S(X2*t)  is  nonzero  only  for  0 < t < Tg  and 
that  X2  is  a discrete  variable  occurring  at  nvTs  only.  The 
reordering  of  the  data  from  z(t)  to  S^ft)  is  an  orthogonal 
transform  and  the  set  of  singular  values  remains  the  same. 
Note  that  X2  and  t have  dimensions  of  length  and  time, 
respectively. 


The  PSF  expressed  in  (*2'*)  variables  now  becomes 


■ 
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, . - ; Xx . yx)  6(x2-nvTB)exp^ju0(t4uTg)jA(x1-X2,y1-,Z2) 


n ■ 
f 


(t  2[(*r»2>M^.rL.^ 


where  0 < t < T, 


8 


Multiplying  exp(-j<oonTg)*exp(-ja>o-^)  and  expf-jo^t) 


respectively  to  h^ , t;x^ ,y^)  yields  range  offset  and 
azimuth  offset  cases,  respectively  [14].  Thus 


h(x2.t;x1.:1)  ” 

!xp(ju0t)  53  «(x2-nvTs)A(x1-x2,y1,z2) 

2 [ (x1-x2)2+y^+2ffi 


(t  2[(xx-»2)  V^z^xpj, 


2 2 

+yi+z-' 


s\-  c 

range  offset  case 


j<u, 


I 


(1^ 


h(x2,t;  x1.71) 


b>g  v / 2|"(x.-x2)  '*^ri'*‘Z2 

exp(j-^x2)  «(*2-nvTg)A(x1-x2.y1,z2)fgl  t — - 

n«-oo  \ 

j.Ju  ^(Xx-X;)2^^]^ 


exp 


azimuth  offset  case 


(i: 


In  the  following  analysis  we  shall  assume  that  the  range 
offset  case  is  used. 


Simplifications  of  PSF 


In  this  section,  we  continue  simplifying  the  PSF 
(eq.(14))  of  the  SAR  imaging  system.  In  the  meantime  a 


! 


hierarchy  of  models  of  PSF's  with  decreasing  complexity  will 
be  derived  along  with  their  associated  assumptions  and 
approximations. 

We  start  by  noting  the  strong  relations  between 
variables  x^  and  x2»  and  t,  respectively:  the  argument  of 
f8  in  eg.  (14) , t-^2  [(x1-x2)  2+yJ+z^J /c,  were  it  not  for  the 
factor  (x^-x2)2>  would  have  yielded  a propagation  delay 
which  connects  y^  with  t only  to  provide  the  range 
information,  and  is  independent  of  azimuth  modulation.  This 
is  the  only  way  by  which  h (x2 ,t»x^ ,y^)  is  not  separate  in 
azimuth  and  range  in  eq.  (14)  . If  this  fact  can  be  ignored, 
e.g.,  if  the  propagation  delay  induced  by  the  variation  in 
(xrx2)  is  much  smaller  than  the  range  resolution 
interested,  then  the  PSF  can  be  considered  separate  in 
azimuth  (x^-*x2)  and  range  (y-*-t) : 


OP 

h(x2,t;x1,y1)  -exp(JajQt)  6 (x2-nvTg)A(x1-x2,y1,z2) 


/ 2(yJ+z2)  \ j 2 f(x1-x2)+y J+z2  ] * 1 

exp  | " j <*>  c L.  - | 


Physically  speaking,  this  means  that  the  range 
resolution  cells  do  not  move  to  overlap  as  the  flight  goes 
on.  The  situation  is  depicted  in  figure  9,  where  the  range 
Y of  a target  point  is  plotted  as  a function  of  the  position 
x of  the  aircraft.  Various  ways  have  been  proposed  to 
alleviate  the  problem  of  range-azimuth  coupling  (15,16,17]. 
If  Pr  is  the  range  resolution  pursued  and  6 is  the  effective 
beam  width  then  this  range  curvature  effect  will  be  ignored 
if  | l (x^-x2) 2+y^+z2]^-(y[+z2)^|  < Pr  for  all  valid  x^-X2  and 
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Y = r(*l-x2)2+yi+Z2l* 


ition  of  a point  target, 
range  resolution. 
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yj_,  where  by  "valid"  we  mean  A(x  j-x^y^z^  is  significant. 
Assuming  (Xj-x2)  <<(y j+z2)^  as  in  e9*  <H)  and  B small,  we 

have 


2 2 

(yi+z2) 


(yi+*2>  + 7 


1 

7 


(xi-x2) 
— * — 
(yi+z2)* 


(xl~x2) 

— 2 — r~f 

(yl+z2> 


2 2 J. 

(yi+z2) 4 


< 1 
--  F 


ef  £ 

— 2 r- 5 

(y!+z2)% 


* 1 Lef f 

rf 

Liiat 

pr  > 1 Lef f 

(17) 

i.e.,  PSF  can  be  approximated  by  eq.(16),  which  is  separate 
in  azimuth  and  range,  if  eq.(17)  holds. 

To  see  that  the  system  with  kernel  eq.(16)  is  separate 
in  azimuth  and  range,  we  rewrite  eq.  (16)  as 

/ 2(y,1+*2> 

hCx^tjx^y^  - exp(jWot)f(  t 

> (x2-nvT8)A(x1-x2,y1,z2)exp 
r.*-® 

- ht(t;y1)hz(x2;x1,y1)  (18) 


-ja). 


2 (x. 


-x2) 


-fyI+z2 
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a 2(y|+zp* 

where  h t ( t ;y  -exp ( j u)Qt)  f g ( t ) is  the  impulse 

response  of  t due  to  a unit  point  source  at  y^,  which  is 
independent  of  azimuth  dimension;  and 

oo 

hz(x2;xl’yl)  * ^ 6(x2-nvTs)  A(x1-x2,y,,z2) 
n=-*>  2 

exp{-ju)c- 

is  the  impulse  response  of  x2  due  to  a unit  point  source  at 

( x i , y i ) . Note  that  hzfx2,xi'yl*  “hz(xi"x2,y  l>  is  space 

invariant  in  x^  and  x2,  but  varies  its  form  as  y^  changes. 
If  we  use  eq. (18)  as  the  kernel  of  SAR,  then  the 
input-output  relation  will  be 


[(x1-x2) 2+y£+z2] 


z(x2,t) 


^ t-r  OO 

-Jj  ht(t;y1)hz(x1-x 

*“  OO 

■ L [ L 


2>yib  Upy^dX]^  d yx 


hz(xi_x2;yi)  p(xi-yi> 


Thus  the  imaging  system  of  SAR  is  to  transform  p(x^,y^)  into 
z(x2,t)  in  a sequential  order:  azimuth  transformation 
followed  by  range  transformation.  However,  because  of  the 
dependence  of  h upon  y.,  h..  and  h are  not  separable  and 
thus  in  general  their  order  cannot  be  interchanged  in 
modelling  the  system.  Accordingly,  the  reconstruction  of 
the  ground  reflectivity  function  p(x^,y^)  from  its  image 
z(x2»t)  ha 8 to  follow  the  reversed  order. 

From  eq.(ll),  |x  ^-x  2 1<<  y 2 hence 
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e-xp  \ -Jou 


2 I (x1-x2)J  + y[+z% 


* exp{-jWr 


2(yf+zp  = + 


(x1-x2)J 

(yf+zp% 


= exp<- ju) 


2(yf+z|)' 


exp  -ja» 


(x1-x2)2 

Cyf+z|)^ 


and  thus  hz(x2;x^,y^)  can  be  approximated  by 

. , „ L 2<yi+z2>% ) v'.,„ 


hz<x2  :tl’yl)  * exP)"Jl 


)>  * 6(x2-nvTs)A(x1-x2,y1,z2) 


exp  -j  — 


<*>„  (Xn-X,)2 


I c <yi+*i>*  ) 

( 2(y2+z2)*> 

Because  the  phase  term  expj-ju)c - Vis  independent  of 

azimuth  its  effect  can  be  taken  out  and  absorbed  in  the 
reflectivity  function  p(x,y) . It  is  a nonlinear  operation 
on  p(x,y),  however,  its  effect  on  the  singular  values  of  SAR 
will  be  ignored.  In  terms  of  block  diagrams,  we  have  figure 
10  where 

L , / 2(y?+zP\ 

- expo0t)fs(t — c — ) 

OP 

h;<xr*2.yi>  A ^js(x2'nvT8)A(xi"x2'yiiZ2)  (19> 

n--» 

,uc  <xrx2)2 

exp  ■'•I# 


Equation  (19)  is  the  fora  assumed  for  most  of  the  processing 
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of  the  SAR  data.  All  the  assumptions  required  are  easily 
justified  to  reach  this  point.  We  shall  proceed  to 
approximate  eq.(19)  by  one  separable  in  azimuth  and  range, 
i.e.  such  that  h is  independent  of  yn  in  eq.(19).  This 
will  be  true  if 


(A) 

(3) 


Afx1-x2,y1,z2)  - A(x1-x2,z2) 


exp 


(xi-x,)2 


C / 3 | j \ 


for  all  valid  y^  and  y^' 


exp 


(A) 


. “c  (xl'x 2* 

C /..2  .,2^ 


can 


(yf+zp 

be  made 


true  by  an  appropriate  antenna  pattern 
geometrical  considerations. 


approximately 
design  with 


(B)  will  be  true  if 


ylmax+z2 


ylinln+z2  !*1  where  yimln».~  ,imax 
y^  coordinates  of  the  target  points  at  maximum  and  minimum 

ranges  covered  by  the  antenna  beam,  respectively.  We  shall 

not  elaborate  on  the  exact  form  of  the  inequality. 


and  y.mflv  mean  the 


It  is  pointed  out  that  in  general  (B)  cannot  hold 
practically.  However,  if  it  could,  then  the  system  of 
eq.(19)  would  be  separable: 


h(x2,t;x1,y1)  - ht(t;y1)  hz(x2-x1) 


(20) 


Further  theoretical  reduction  of  h is  still  possible: 

if  we  ignore  the  offset  frequency  term  and  assume  that 

yl>>z2  for  ail  valid  y^  such  that  2(y|+zp^/c  can  be 

approximated  by  ^yl  and  by  changing  variable  t2*  2yl 
c c 

■*) 

or  ht(t;t2)  - fg(t-t2) 


then 
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(21) 


' 


J 


hjCxj.tjXj.tj) 

ht(t-t2)hz(x2-xi) 


is  a separable  space  invariant  PSP  (SSIPSP) . 


A summary  of  the  properties  of  the  PSF's  under  various 
assumptions  are  listed  in  Table  1. 
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TABLE  1 


EQUATION 

NUMBER 

FORM  OF  PSF 

ASSUMPTIONS 

(13) 

(14) 

(15) 

NSVPSF1 2 

1.  separate  pulse  return  do  not 
overlap 

2.  aircraft  stationary  in  one 

Ts  time  or  Lgff  <<  y l 

(16) 

(18) 

NSVPSF 

separate  in  azimuth  and 
range  (order  counts) 

invariant  in  azimuth 

!•  Pr  > | Leff 

(20> 

SSVPSF3 4 

Invariant  in  azimuth 

1.  A(x1-x2,y1,z2)  * A(x1-x2>z2) 
2 yl  «.*«2  . ! 

y[  min+z2 

(21) 

SSIPSI^ 

1.  range  offset  frequency  u)Q=0 

*i 

1.  All  assumptions  in  upper  blocks  remain  true  in  lower  ones 

2.  Nonseparable  space  variant  PSF 

3.  Separable  space  variant  PSF 

4.  Separable  space  invariant  PSF 
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3.2  DOF  of  Tomographic  Projections 
Chung-Ching  Chen 


I ntroduction 

Recently  the  reconstruction  of  an  image  from  its 
projections  has  gained  more  and  more  interest  in  many  fields 
such  as  electron  microscopy  [1],  radio  astronomy  (2J, 
medical  tomography  [3J,  etc.  If  the  projection  systems  are 
ideal,  e.g.  if  the  projection  beam  is  infinitely  narrow,  it 
is  possible  to  reconstruct  the  image  perfectly  by  direct  [2) 
or  indirect  [1]  manipulations  of  the  projections.  However, 
in  reality,  the  projections  are  blurred  and  usually  only  a 
finite  number  of  projections  are  available.  Although  in 
general  we  could  obtain  more  information  by  more 
projections,  it  seems  that  because  of  the  inherent  blurring 
of  the  system,  there  is  an  "optimal"  amount  of  projections 
beyond  which  the  cost  of  increased  data  storage  and 
manipulation  is  not  worth  the  rapidly  decreasing  information 
return.  The  concept  of  degrees  of  freedom  (DOF)  has  thus 
arisen  to  quantify  the  number  of  truly  independent  data 
gathered  under  a certain  geometry  14,5]  . By  examining  the 
eigenvalue  spectrum  of  the  correlation  matrix  (or  the 
Gramian)  associated  with  the  linear  imaging  system,  one  is 
able  to  evaluate  the  system  performance  in  the  sense  of  DOF. 
Roughly  speaking,  in  the  eigenspace  of  the  system  only  those 
components  (or  singular  values)  whose  magnitudes  are  larger 
than  the  noise  level  provide  information  about  the  input 
with  high  enough  confidence.  Discarding  those  singular 
values  of  relatively  small  magnitude  also  guarantees  an 
optimum  bandwdith  reduction  or  data  compression  in  the 
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minimum  mean  square  sense,  among  other  advantages  of 
eigenspace  manipulations.  As  is  well  known,  the  complexity 
of  the  diagonalization  process  of  an  arbitrary  N x N matrix 
increases  very  rapidly  with  N (of  the  order  ) [6],  in 
addition  to  the  fact  that  only  approximate  values  can  be 
obtained  because  of  the  iterative  nature  of  the  algorithms. 
This  limits  the  practicability  of  eigenspace  analysis  even 
for  moderately  large  N unless  the  system  (and  hence  its 
matrix)  is  well-structured  and  algorithms  to  fully  utilize 
its  structure  to  speed  up  the  computations  exist. 

In  this  paper  we  discuss  the  principles  of  operation  of 
a tomographic  projection  system,  formulate  its  Gramian 
matrix  and  seek  fast  ways  to  find  its  eigenvalues  for 
certain  sampling  functions. 

Formulations 

Referring  to  figure  1 let  the  rectangular  coordinate 
system  ( £, n)  be  fixed  on  the  object  whose  size  has  been 
normalized  to  be  contained  within  a unit  circle  and  (x,y)  be 
another  rectangular  coordinate  system  with  the  y-axis 
parallel  to  the  lines  of  projection  for  a given  source 
orientation  angle  9 defined  as  the  angle  between  axes  x and 
£ as  shown  in  figure  1.  The  relations  between  the 
coordinates  (x,y)  and  (£,n)  are 

x - £cos0+nsine  £ - xcosfi-yaine 

or 

y - -£sin0+ncos0  n " xsine+ycose  (1) 

For  each  0 Mr  samples  Y2»...YM  are  taken  corresponding 
to  Mr  non- ideal  line  projections  perpendicular  to  the  x-axis 
with  0 <,  T|(i  1.  The  number  of  samples  in  0 is  Ne,  with  6^, 
i-1,2, . . . ,M0  ranging  from  0 to  2 n.  Thus  the  total  number  of 
samples  is  and  the  system  is  a continuous-discrete  one 
(7]  because  the  input  variables  (€,n)  are  continuous  and  the 
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dinate  systems 


5- 


output  variables  (y,0)  are  discrete.  Let  f(£,n)  be  the 
object  density  function,  then  the  projection  is 

related  to  f(£,p)  as  a two  variable  Fredholm  integral 

z(Yk.0i)  “ j f(5.n)  h(0i,Yk;C.n)  d^dn  (2) 

where  h(0^,y^;^,  n)  is  the  point  spread  function  (PSF) 
designating  the  contribution  of  a unit  magnitude  point 
spread  source  at  ( K,  n)  to  the  output  variables  ( 0^  Yk)  and  R 
denotes  the  unit  circle.  Note  h(0^,  Yk;C,  n)  can  be  rewritten 
as  h ( 0 Yk?x,y)  through  the  transformation  of  eq.(l).  We 
assume  that  the  projections  are  aimed  at  Yk,  k*l,...,Mr,  and 
assume  the  same  waveform  id(x)  in  the  x direction.  Then 
h(0i'  Yk'x,y)  = U)<x_Yk)  is  3 shifted  version  of  u (x)  . Thus 
eq. (2)  becomes 

“ ||  n)  a«(ccose1+nsinei_Yk)  d£dn  (3) 

where  we  have  used  the  relation  of  eq. (1)  with  the  addition 
of  subscript  i.  Note  that  x^  and  y^  are  continuous 
functions  of  £ and  n , although  the  functions  depend  on 
discrete  variables  because  of  the  discrete  sampling  over 
the  whole  circle. 

In  terms  of  matrix  notation,  eq. (3)  assumes  the  form 

[Z]  - ||  [H(c.n)]  f(5.n)d?dn  (4) 

J R 

where  [Z]  and  [H(£,  n)  1 are  matrices  of  size  Mr  x M0.  For 
our  purpose  it  is  more  useful  to  put  eq. (4)  in  a vector 

z-  | | h(5,n)f(e.n)dcdn  (5) 

R 

where  s and  h are  M N.  x 1 matrices.  Degrees  of  freedom  of 
— — x u 
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the  imaging  system  can  be  derived  from  the  eigenvalues  of 
the  following  correlation  matrix 

[r]  - J j hU.n)ht(c.n)dCdn  (6) 

R 

If  the  image  vector  z is  formed  in  a lexicographic  order 
then  the  matrix  will  be 


[r]  - 


f Cr/ 1 ’ l > . [rf1,2? {r]  (1>M0>  i 


L [r]  ^Me  ’ ^ ,[r]  ^Me  • 2^ £r]  (Me  • Me>-I 


(7) 


where  lr]^’mis  the  correlation  matrix  between  i-th  an  m-th 
samples  in  0.  The  (k,l)-th  entry  of  (rf^,tn^is 

Y(k;T)  - 1 1 “<xi-V  d*dn  (8) 

R 

where  x.  and  x are  defined  in  eq. (1)  with  0 ■ 04,  0 ■ 0 . 

respectively  and  (otx^-y^)  is  the  pulse  function  along  the  x^ 
axis  shifted  by  units.  A similar  geometric 

interpretation  for  exists.  The  situation  is 

depicted  in  figure  2 where  for  pictorial  clarity,  finite 
width  of  w(t)  is  assumed.  The  integration  of  eq.  (6)  is  over 
the  overlapped  region  of  W(*l-Yk)  and  which  lies 

within  the  unit  circle.  Unfortunately  there  is  no  closed 
form  for  the  expression  of  eq. (8)  for  arbitrary  function 
u(x) . A major  difficulty  is  that  the  integrand  is  limited 
to  the  unit  circle  Instead  of  the  whole  plane.  Appropriate 
assumptions  on  the  shape  of  w(x) , as  are  usually  practical. 
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fMMfc* 


can  alleviate  the  problem  to  yield  satisfactory 
approximations  of  eq.(8). 


Sharp  Projections  and  Their  Approximate  Correlations 

Let  u 8 assume  that  the  "effective"  width  of  w (x)  is 
much  smaller  than  1 or  the  maximum  radial  dimension  of  the 
object,  so  that  if  the  intersection  point  of  the  two  lines 
xi  “ Ykand  xm  * Yulies  outside  the  unit  circle,  the 

integration  in  eq. (6)  can  be  approximated  by  zero,  while  if 
the  intersection  point  lies  within  the  circle,  the 

integrand,  which  is  shown  geometrically  in  figure  2,  can  be 
appropriately  replaced  by 

m 

Y(k[?)  * | | “<xi>  d?dn  ^ 

-oo 

where  the  limits  of  the  integral  have  been  extended  to 
infinity.  In  other  words,  we  approximate  eq. (8)  by 

00 

Y(k!?)  " iXi.m.k.l)  J J w(*1)w*(xJn) dedn  (10) 

where  U(i,m,k,l)  is  a mask  function  defined  as 

xi’,k 

U(i,m,k,l)  - 1 if  solution  of  lies  w£thin  unit  circle 

*m-Yl 


- 0 , otherwise 

He  shall  find  a mathematical  expression  for  U(i,m,k,l): 
Ccosej+ne infl  j*Y ^ 

(11) 

5cose|1+nsin0n-Y1 

The  solution  (£<,»%)  of  eq.(ll)  is 
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^k«ln9m-Y1sln9t 
^0  * 8tn(0ra-ei) 

Ylco80i-Ykco89m 

n0  " “sin  (0-0 

in  x 

and  ( £Q»n0>  lies  in  a unit  circle  if  and  only  if 
4*^1  < 1 1ft 

[YkfYl'2'rkY  Lcoa(0m"6l)]  ^ ll8in,'eni-9i^  I 

Figure  3 shows  a geometric  interpretation  of  the  above 

inequality,  which  is  an  even  function  of  0 -0..  Thus 

m I 

17(1,3, k,  1)  - 1,  if  [Y2+Y2.2YkYic°s(0ra-ei)] ^sln^-S^I 
- 0,  otherwise  (13) 


Diagonalization  of  the  Gramian  for  an  Infinite  Domain  w (x) 
Function 


Let  us  assume  that  u(t) 
2 

with  variance  o , i.e. 


has  a normalized  Gaussian  shape 


u<x) 


(14) 


Let  u be  the  bisector  of  the  angle  formed  by  and  Xj^ 
and  v be  the  axis  90°  counterclockwise  from  u,  as  in  figure 
4.  since  the  angle  between  u and  x<  is  6m"ei  , we  have  the 
following  transformation  relations: 
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3 


i 


e -e . 0 -0i 

- u cos-S* V sin-qy  ■ 


xm  - u cos 


9-0.  0-01 

_®i  +v  sin-.,  1 


We  are  now  in  a position  to  find  a closed  form  for  eq. (9) , 
in  this  case  in  terms  of  variables  u and  v: 


Hence 


jjaKxpu^x^dCdn 

— CD 

00 

“ ||w(x^)u)*(xm)dudv 

— CD 

00  00 

" TW^  \ exP{_^-  co*2  V6i>duf  exp(-jisin2  Vei}  dv 

A<®  2 1»  • ° 9 


- csc|0m-01| 


Because  eq.  (16)  blows  up  at  0 >*e. , a more  accurate 

m i 

approximation  for  this  case  Is  required.  Assuming 
negligible  overlap  between  adjacent  pulses  with  the  same 
orientation  as  before,  we  consider  the  case  Yt.  aYi  # x4  • x 

K 1 X ID 

only  as  in  figure  5.  In  this  case  we  modify  the  limits  of 
the  integral  of  variable  v to  get 
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l dv 


- _L_  . X-Y  l 


o/ii 


Thus  we  have  the  following  expression  forY^“j 
Y(k!l)  “ U(l,n,k,l)  c8c|0ni-e1|  ,e„/ei 


(17) 


1 

o/? 


V9i-  Yk"Yi 


- 0 


W Yk^i 


(13) 


where  U(i,m,k,l)  is  given  by  eq.(13).  Mote  that  eq. (18) 
iaplies  that  the  adjacent  O' s are  not  so  close  as  to  make 
%U(l,a,k,l)  csc^-B^)  larger  than  either  or 
j^jf/l-Y[.  If  this  should  happen  in  eq.(lB),  an  appropriate 
thresholding  function  should  be  incorporated  in  the 
expression.  This  is  the  result  of  the  Schwarts  inequality 
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j[“<W“<Vlri)d5d,> 

R 

- [|/u2(xrYk)d£dn  |Ja,2(xm"Yl)dCdn]% 

< (x1-Yk)d€dn)7 f.« < VYl)dcdn» 


where  to(x)  is  a nonnegative  function.  Note  thatY^’™? 
depends  on  9^  and  0^  only  through  their  difference,  a fact 
very  useful  in  diagonalizing  [ r ] to  find  its  eigenvalues. 
In  addition  Y^’i)  *8  an  even  function  of  0^-0^.  These 

phenomena  make  the  Gramian  matrix  look  like 


W 


[r](1,1>. 

[r]<l,2>. 

[r] 

[r](l'2>l 

63a-», 

m(1-2>.. 

[r] (1,3) 

[r]<1'2), 

[r]d.3) 

...ra<l-2>. 

[r]  (i.D 

a circulant  matrix  whose  elements  are  symmetric  matrices  but 
are  not  necessarily  circulant  themselves.  It  is  pointed  out 
that  the  thresholding  function  of  eq.(18)  as  mentioned 
earlier  does  not  affect  the  circulant  property  of  eg. (19). 
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Past  algorithms  to  diagonalize  eq.(19)  exist  (8]. 
Because  of  its  circulant  property  with  elements  [ r )^ ' [T] 
can  first  be  block  diagonalized,  a process  similar  to  the 
one  diagonalizing  a circulant  matrix  except  that  here, 
entries  of  (H  are  matrices  of  sizes  Mr  x Mr  instead  of 
scalars.  After  block  diagonalization,  the  block  diagonal 
matrix  can  be  diagonalized  by  individual  block 
diagonalization  processes,  greatly  reducing  the  computation 
required. 

Summary 

The  imaging  system  of  tomographic  projections  has  been 
reviewed.  The  structure  among  the  datd  gathered  is  utilized 
to  yield  a computational  reduction  in  diagonalizing  the 
Gramian  matrix  of  the  system  for  projections  with 
Gaussian- shaped  blur.  Sharp  projections  are  assumed  to  come 
up  with  a closed  form  expression  of  the  elements  of  the 
Gramian  matrix,  although  other  approaches  based  on  different 
assumptions  also  exist  [8].  In  the  Appendix,  a rectangular 
pulse  form  is  assumed  and  the  corresponding  Gramian  derived, 
although  many  other  waveforms  are  also  possible  to  get  a 
closed  form  for  [TJ.  It  is  observed  that  they (£’£)»  of  the 
"autocorrelation”  of  the  pulses  increases  as  the  width  of 
pulses  decreases*,  while  i^m  or  the 
"cross-correlation"  is  independent  of  the  pulse  width  due  to 
cancellation  of  the  area  under  pulses  with  widths  of  pulses. 

In  the  ideal  case  where  the  pulse  is  infinitely  narrow, 
*thi8  phenomena  is  due  to  the  fact  that  the  value  of  a 
diagonal  entry  is  equal  to  the  peak  value  of  the  response  of 
a matched  filter  which  is  a measure  of  the  total  energy  of 
the  signal.  the  total  energy  of  a normalized 
Gaussian-shaped  signal  increases  as  its  "variance"  decreases 
as  can  be  easily  verified. 


* 


' 


I r 1 resembles  a diagonal  matrix. 


Appendix:  Gramian  Analysis  for  Rectangular -Shaped 

Projections 

He  assume  <*>(x)  ■ b rect(bx)  as  shown  in  figure  6 where 
rect(x)  - 1 -y  < x < y 

- 0 otherwise 

and  u)(x)  has  been  normalised  such  that  j «(t)  dt  ■ 1. 
Similar  to  the  case  with  Gaussian- shaped  projections,  we 
assume  ^ <<  1 so  that  the  approximation  schemes  carry  over 
here. 


Referring  to  figure  4 


u(xi)!j(xn)  - 


b' 

0 


if  6 parallelogram  ABCD 

otherwise 


Note  that  (x.,x  ) are  not  rectangular  coordinates 
l m 


|ja)(x1)u)*(xm)dCdn 
— 00 


dvdu 

D 


- b2  x Area (ABCD) 


<1> 


Since  OD  • 75  sec 

llu(x 


0 -e, 

and  AO  ■ 

2 

i)<D*(xm)dedn  - 


1 e -0., 
7E  cscjn^J,  * 

2b2  x AO  x OD 
csc|0o-0il 


<2> 
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Again,  eq.<2>  blows  up  at0m“ei*  So  for  i«m,  k*l,  we  again 
modify  the  limits  of  the  integral  in  eq.<l>. 


Y(k.k) 


b rect(bu) du 


ft 


1 dv 


-2b.X^f 


Thus 


- «Ci.m.k.l>cscJen|-el|  _ 


*2b 

- 0 


V 1 - 


e 1*0. 

nr  i 


V9i*  Yk=n 
Vei*  Yk  n 


<3> 


Again  depends  on  i,m  only  through  the  difference 
9j-e  . The  diagonalization  of  [rl  is  thus  similar  to  that 
discussed  earlier.  Thresholding  upon  the  values  of 
off-diagonal  entries  may  be  necessary  if  the  projections  are 
very  densely  sampled  compared  to  the  width  of  the  projection 
functions. 

Comparison  of  eq. (18)  and  eq.<3>  shows  that  if  o ■ 2b7n 
and  is  sufficiently  small  then  Gaussian  projections  and 
rectangular  projections  have  the  same  Gramian  matrix  and 
thus  from  the  DOF  point  of  view,  they  are  of  the  same 
per  forma nee. 
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3.3  Psychovieual  Transform  Coding  of  Images  (Supported  by 
WPAFB  under  Contract  F-33615-77-C-1016) 

Charles  F.  Hall 

Several  recent  papers  have  discussed  the  merits  of 
transform  image  coding  (1-6].  This  particular  research  is 
concerned  with  an  extension  of  this  technique.  The  images 
to  be  coded  are  first  preprocessed  with  an  algorithm  which 
is  based  on  a model  of  the  human  visual  system  (HVS)  [7]  and 
(8].  The  system  to  be  used  is  shown  in  figure  1. 

The  first  two  blocks  of  figure  1 represent  the  HVS. 
The  bandpass  filter  portion  is  implemented  with  the  equation 

A(fr)  - 2 . 6[0 . 0192+0 . 114f r]exp [- (0 . 114f r) 1 " L]  (1) 

(see  [8]  for  an  illustration  of  this  isotropic  filter 
function) . The  next  three  blocks  represent  the  coding 
portion  of  the  system.  The  preprocessed  image  is  cosine 
transformed,  quantized,  and  then  inverse  transformed.  The 
cosine  transform  may  be  taken  in  any  block  size  up  to  the 
full  image  size,  which  for  this  work  was  256  x 256.  The 
quantizer  was  a Max  quantizer  which  assumed  a Gaussian  pdf 
for  all  transform  coeff icients.  For  the  D.C.  term  a mean 
was  estimated  and  subtracted  from  the  D.C.  value  prior  to 
quantization  and  then  added  back  subsequently.  After  the 
transform  coefficients  are  quantized  they  are  inverse 
transformed  and  then  passed  through  the  final  two  stages 
which  represent  the  inverse  HVS. 

There  are  several  advantages  to  coding  by  transform 
blocks  smaller  than  the  image  size.  If  one  transforms  8x8 
blocks  for  example,  only  eight  lines  are  required  before 
processing  begins.  In  addition  only  two  8x8  covariance 
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[_ Human  Visual  System 


i 


Fig  2.  Mandala  Cosine  Transform 
(8x8  blocks) 


Fig  3.  Cosine  Coded  1 bit/pixel 
(8x8  blocksize) 

NMSE  = . 395% 
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matrices  need  to  be  diagonalized  and  used  to  compute  an 
8x8  bit  map.  To  visualize  how  this  bit  map  is  used  the 
entire  cosine  transform  domain  may  be  reordered  for  display 
purposes  into  a two-dimensional  Mandala  transform  domain 
[9J.  Such  an  ordering  of  an  8 x 8 block  cosine  transform  of 
the  GIRL  picture  is  shown  in  figure  2.  The  input  image  was 
256  x 256,  thus  there  are  1024  terms  for  D.C.  and  every 
component.  For  the  cosine  transform  there  are  64  unique 
frequencies  including  "0",  i.e.  D.C.  The  1024  terms  make 
up  a 32  x 32  array  which,  when  scaled  for  visible  display, 
form  a subimage  in  figure  2.  Every  term  in  each  subimage  is 
coded  with  the  same  number  of  bits  and  since  there  are  only 
8x8  subimages  our  bit  map  is  only  8x8.  Note  how  the 
increasing  harmonics  (left  to  right  and  top  to  bottom) 
represent  more  and  more  "edge"  information  and  the  highest 
harmonic  (lower  right  subimage)  is  almost  random  noise. 
These  particular  subimages  are  set  to  zero  in  the  coding 
process.  (The  reader  should  note  that  each  subimage  has 
been  scaled  for  display.  There  were  more  than  6 orders  of 
magnitude  differences  between  points  in  upper  left  and  lower 
right  before  scaling). 

If  we  actually  quantize  the  block  cosine  transform  of 
the  GIRL  for  an  average  rate  of  1 bit/pixel  we  would  get  the 
result  shown  in  figure  3 (after  inverse  cosine 
transforming).  The  normalized  mean  square  error  (NMSE)  in 
this  case  is  0.395%.  If  one  examines  the  picture  carefully, 
particularly  at  edges,  small  8x8  blocks  can  be  detected. 
These  blocks  become  visible  when  errors  in  quantizing  the 
D.C.  term  for  that  cell  become  large  enough  to  shift  the 
average  grey  level  a detectible  amount.  If  we  code  the 
image  with  a block  size  equal  to  the  image  size,  obviously, 
this  problem  disappears.  Figure  4 contains  such  an  image. 
In  this  case  the  bit  rate  was  0.7  and  the  NMSE  was  0.33%. 
Thus,  we  have  reduced  the  bit  rate  by  30%  while  decreasing 
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Fig  4.  Cosine  Coded  .7  bit/pixel 
(256  x 256  blocksize) 

NMSE  = . 337» 


Fig  5.  Psychovisual  Cosine  Coded 
1 bit/nixel 
(8x8  blocksize) 

NMSE  - .434% 


Fig  6.  Psychovisual  Cosine  Coded 
.7  bit/pixel 
(256  x 256  blocksize) 

NMSE  - .35% 
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the  NMSE  about  20%. 


► 


In  the  previous  paragraph  we  considered  an  image  which 
had  not  been  preprocessed  by  the  human  visual  system  model. 
I f we  now  preprocess  the  GIRL  with  the  HVS  and  8x8  block 
code  as  shown  in  figure  1 we  obtain  the  result  shown  in 
figure  5.  This  image  was  coded  at  1 bit/pixel  and  the  NMSE 
was  0.434%.  Although  the  NMSE  is  larger  than  the  8x8 
block  result  in  figure  3,  a close  inspection  reveals  that 
the  8x8  subblocks  are  not  as  visible  in  figure  5.  Again, 
a 256  x 256  block  size  was  used  and  the  result  is  shown  in 
figure  6.  In  this  case  the  rate  was  0.7  bit/pixel  and  the 
NMSE  was  0.35%.  The  reduction  in  NMSE  is  again 
approximately  20%  with  a 30%  decrease  in  bit  rate. 

If  one  refers  to  figure  1 and  considers  the 
implementation  of  the  HVS  linear  filter  is  in  the  Fourier 
domain  the  logical  question  is,  "why  not  code  in  the 
filtered  Fourier  domain?"  This  coding  scheme  eliminates  the 
cosine  transforming  completely  and  also  eliminates  a forward 
and  an  inverse  Fourier  transform.  The  system  is  therefore 
reduced  to  that  shown  in  figure  7. 

The  GIRL  picture  was  coded  in  this  manner.  The  result 
is  shown  in  figure  8.  This  image  has  a rate  of  1 bit/pixel 
and  a NMSE  of  0.26%.  The  quality  of  this  image  compared  to 
the  other  results  within  this  paper  indicates  that  coding 
the  HVS  filtered  Fourier  domain  (as  proposed  in  [7])  results 
in  excellent  quality  images.  The  technique  is  being 
extended  to  color  imagery  as  proposed  in  (7]  and  initial 
results  are  encouraging. 
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3.4  Statistical  Analysis  of  a Model  of  the  Human  Visual 
System  (Supported  by  WPAFB  under  Contract  F-33615-77-C-1016) 

Charles  F.  Hall  and  Lloyd  R.  Welch 


In  a recent  paper  Hall  and  Hall  have  discussed  several 
models  for  the  HVS  [ 1 ) . In  this  section  we  will  analyze  one 
of  these  models  (shown  in  figure  1),  both  statistically  and 
deterministically.  The  main  objective  will  be  to  determine 
the  assumptions  which  may  be  made  about  the  characteristics 
of  images  after  they  are  passed  through  the  model.  In 
particular,  those  characteristics  consistent  with  the 
assumptions  must  be  made  to  apply  the  known  solutions  to  the 
parametric  set  of  equations  which  are  the  heart  of 
rate-distortion  theory  (2,3). 

We  will  begin  by  first  considering  the  assumption  of 
Gaussian  pdf.  The  image  will  be  assumed  to  be  Gaussian 
after  passing  through  the  nonlinearity.  This  assumption 
presents  no  problem  since  the  filter  of  figure  1 is  linear 
and  hence  the  output  of  the  filter  will  be  Gaussian  if  the 
input  is.  When  the  output  of  the  nonlinearity  is  Gaussian, 
what  is  the  pdf  of  the  input?  This  question  is  answered 
quite  simply  by  applyinq  a fundamental  theorem  discussed  in 
section  5-2  of  Papoulis  (4). 
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Figure  1.  Nonlinear  model  of  HVS. 
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Figure  2. 
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For  the  analysis  consider  figure  2.  Let  x be  the 
Gaussian  distributed  image.  Fro*  figure  2, 


y - g(x)  - e* 


also 


f (•)  - e"(x'p)2/2°J 

zxK  ’ THa  e 


Solving  for  x in  terms  of  y 


also 


Thus 


- In  y^ 


g’(x)  - e* 


£«(•)  " 


fx<V 

vr:^n 


becomes 


1 

VTnay^ 


(In  y1-u)2/2o2 


*1  i 


> 0 (5) 


This  pdf  is  known  as  the  lognormal  distribution  and  has 
several  interesting  characteristics  [5].  Plots  of  this 
function  for  several  values  of  p and  o2  are  shown  in  figure 
3.  The  similarity  to  image  histogram  data  is  immediately 
apparent.  The  histogram  data  for  the  Kodak  GIRL  image  was 
plotted  on  log  probability  paper  and  is  shown  in  figure  4. 
Note  that  the  data  points  are  essentially  three  straight 
lines  over  the  1%  to  99%  range.  This  indicates  the  data  is 
strongly  lognormal.  Thus,  we  see  the  HVS  models  help 
satisfy  the  common  assumption  (which  is  unrealistic  for  an 
unprocessed  image)  that  the  data  is  Gaussian. 

Next,  let  us  consider  the  entropy  of  the  two  processes. 
We  will  use  the  common  definition  for  differentiated  entropy 
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H(x)  - -J  p(x)  in  p(x)  dx  (6) 


where  p ( • ) denotes  the  pdf  of  x.  Shannon  has  shown  for  the 
Gaussian  case  we  get  [6] 

H(x)  ” /2tt eo  (7) 

Consider  the  lognormal  distribution, 

P(x)  " o7Hx  e'(ln  X_p)2/2°2  (8) 


In  p(x)  - -ln[o/Z¥x]  - iln  <9> 


H(x)  - | p(x)  j\n (a/SFx)  + dx  (10) 

where  the  lower  limit  of  integration  has  been  changed  to  0 
since  x ranges  from  0 to  « for  the  lognormal  pdf.  Thus, 

H(x)  =*  | p(x)ln(o/2?)dx  + J p(x)ln  x dx 

° „ 2 ° (U) 

+ f p(x)  dx 

> - la2 


therefore 


f p(x)dx  - 1 
1 0 

H(x)  - ln(o/2ir)  + J p(x)ln  x dx  + 

jo 

f p(«)  U&  YH>*  dx 

J - la1 
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Now  let  y * 1 nx  which  implies  x * ey  and  that  dx  * eYdy. 
Also,  when  x * 0,  y * - <»  and  when  x * <»,  y * <*> . 
Substituting  into  eq. (8)  gives 


' /Tiro  exp(y) 


(y-u)2/2a2  (14) 


Completing  the  substitution  in  H gives 


•)  - in (o/7?)  + f”  __z_  e~(y-u)2/2a2 


(y-u)2  e-(y-u)2/2o2 


The  first  integral  is  just  the  mean  of  a Gaussian  pdf  and 
the  second  integral  is  the  variance,  therefore. 


H(-)  ■ ln(o/27)  4-  y + — o2 


ln(o/Zire)  + u (16) 


Thus,  for  a nonzero p we  have  an  entropy  chanqn  qoing  through 
the  system  which  is  equal  to  u , the  mean  of  the  resultant 
Gaussian  pdf. 

Next  we  will  consider  the  autocorrelation  and  power 
spectrum  for  such  a system.  Now 


Rx(t>  - E{x(t)x(t+r) } 


■rw 


- eyiey2.-%(y-v.)Tc-‘(5-u) 


where  N is  the  dimension  of  the  system  and  equals  2 for  our 

case.  Also, 

y " <*1  X2>  (18) 
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a nd 


O Py(T) 


°2Py(T) 


(19) 


In  this  case  P (T)  is  the  normalized  autocovariance  defined 
P..(t)  = Etl  y(t)-u  1 I vft+O-ull 

(20) 


py(t)  = E |[y(t) -u]  [y(t+x)-y]| 
Now  let  Xj  * x2  * -j  and  rewrite  rx(t)» 

»x(t)  - Eje^V^j 


(21) 


II 


dy1  dy2 


/2W  |C|% 

The  above  equation  is  in  the  form  of  a characteristic 
equation.  The  characteristic  function  for  a two-dimensional 
Gaussian  of  nonzero  mean  is  (see  [4],  p.  255) 


♦ (1)  - e"^Tc* 


(22) 


where  X « (-j  -j) , therefore, 

.2 


R^t)  - - e°  C1+Py(T>]+n(t)+ii(t+T)  (23) 


For  the  input  process  x,  we  will  assume  the  general 
form  for  the  autocorrelation  in  terms  of  the  covariance  of 
the  process, 

VT>  - Mi  + Cx(t)  - „*  + o*  px(t)  <24) 

Then,  from  eq. (23)  we  have 
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„ , . 2.2  /x  2vy+  °y^1+Pv(T)1  (25) 

rv(t)  - u2  + a2  Pv(t)  = e y y y 


„ Vi  +%a 2 

E( x}  * ux  =»  E{ey}  **  e y y 


therefore 


and  substituting  in  eq. (25)  yields 


4 + °x  <>x<T> 


2 °i  py(T) 
y2  e 7 y 


n , . °y  P„(T) 

px(t)  =*  e y y 


Expanding  the  right  side  of  this  equation  gives 

„ V"''  9b  P.,R, 


px(t)  - 1 + o2  py(T>  + 


The  sun  of  eq.(30)  represents  the  error  if  we  use  only  the 

first  two  terms  of  the  expansion.  The  normalized  covariance 

of  any  process  will  have  an  upper  bound  of  1.  Also,  for 

typical  picture  data  o2*0.5.  Thus,  the  worst  case 
05  y 

expansion  is  on  e ’ and  the  error  in  taking  only  the  first 
two  terms  is  less  than  101.  This  is  a very  conservative 
error  bound,  particularly  since  it  assumes  the  data  are 
completely  correlated.  We  will  neglect  the  error  term  in 
eq. (30) , giving 


- uf  ’ 1 + °'y  »y<*>  <»> 

From  eq.(29),  if  we  take  the  natural  logarithm  of  both 

sides, 

f o* 

lr‘  ! 1 + 4 P*(T)J  " °i  py(T)  <32> 
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Typical  image  data  for  the  x process  gives  a ^ ratio  of 
0.16.  Again  for  a worst  case  px(t)x»  * we  9et 
ln(l+0.16)  ■ 0.14842  which  is  within  8%  of  0.16.  Thus, 
within  experimental  error 


, o* 

'r‘  ! 1 + jpr~  Px(t)  * ~r 
x px 


PX<T>  " °y  Py<T) 


Now  px  (t  ) ■ 1 for  t » 0 for  any  valid  covariance  function, 
therefore 


Oe^dT 


Substituting  eq. (34)  into  eq.(31)  gives 

1 + °y  px(t)  " 1 + °y  Py(T)  (35) 
which  implies  that  p (r)  «p  (r).  Thus,  the  output 

x y 

autocorrelation  becomes 

Ry(T)  « P*  + oj  PX(T)  (36) 

By  definition  the  power  spectrum  of  y process  is 

sy(u»)  - | Ry(T)e’JWTdT  (37) 

Therefore,  (“ 

sy(<*>)  “ j [viy  + Oy  PxCt ) J e“^turdT 

» CD 

- 2iryy  6 (w)  + o*  J PX(T>  e"^“Tdx 

This  relationship  is  of  great  importance  in  rate-distortion 
applications.  Given  an  input  autocorrelation,  we  can 
compute  the  output  power  spectrum  which  can  be  used  in  the 
equations  (see  (2],  p.  117) 

D8  “ 27  } troln  8'  •<«*)]  du  (39) 


Therefor 
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and 


R(V  " Kv  I t”1**  01  1°8^Q!^-]dw  (40 


Thus,  the  rate-distortion  curve  of  the  source  x,  after 
passing  through  the  logarithmic  nonlinearity,  can  be  easily 
obta ined. 

A common  assumption  for  imagery  data  is  that  it  is 
Markov,  in  particular,  , , 

p (x)  - e-a  I T I 

e (41) 

Substituting  this  form  into  the  power  spectrum  equation 
g ives 


Sy(u>)  “ °y 


J e-o|T|e-J(0tdT  + 2nwZ  6(w) 


2a  o2. 

i + 2irp2  6 (oj) 

a2+  w*  y 


(42) 


Referring  to  figure  1 and  letting  H(w)  be  the  linear  filter 
function  we  see  that 


sz<“>  " Sy(u)  “ 


2ocv  , 

— — ^ + 2irp*  6 (u) 

a2+w2  y 


I H(u) | (43) 


We  have  shown  that  if  an  image  source  is  lognormal  and 
Markov,  then  after  passing  through  the  HVS  model  of  figure  1 
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it  will  be  Gaussian  Markov  with  a power  spectrum  defined  by 
eq.(43).  Furthermore,  the  entropy  of  the  original  source 
will  be  changed  by  w , the  mean  of  the  resultant  Gaussian 
pdf.  These  results  are  being  used  to  find  theoretical 
bounds  on  the  coding  improvements  which  can  be  realized  when 
images  are  preprocessed  with  models  of  the  HVS. 
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3.5  A Technique  of  A Posteriori  Restoration 


John  Morton 


This  effort  is  concerned  with  restoring  a degraded 
image  while  assuming  a minimum  of  a priori  information.  The 
previous  reports  [1-2]  have  outlined  the  approach  to  be 
investigated  and  in  addition,  contain  summaries  of  the 
progress  achieved  up  to  the  respective  report  dates.  This 
report  will  present  some  preliminary  results. 

Figure  1 contains  four  statistically  similar  images, 
statistically  similar  in  the  sense  that  the  power  spectra 
corresponding  to  the  four  images  are  similar.  Note  that  the 
subject  content  of  one  image  is  different  from  the  other 
images.  The  difference  in  subject  matter  was  intentional  to 
emphasize  the  point  that  different  image  subject  matter  may 
corrrespond  to  similar  power  spectra. 

A contour  plot  of  the  log  of  the  power  spectrum 
calculated  from  figure  la  is  presented  in  Figure  2.  Figure. 
3 contains  the  log  of  the  average  power  spectrum 
calculated  from  figures  lb,  lc,  and  Id.  Note  the  close 
similarity  between  figures  2 and  3.  The  power  spectrum 
depicted  in  figure  3 will  be  used  to  estimate  the  magnitude 
of  the  optical  transfer  function  (OTF)  via  the  method  of 
Cannon  [3]. 

Figure  4 contains  a perspective  plot  of  the  point 
spread  function  (PSF)  to  be  used  to  degrade  the  image 
presented  in  figure  la.  Figures  5,  6 and  7 contain  the 
image  of  figure  la  degraded  by  the  PSF  of  figure  4 together 
with  restorations  assuming  different  combinations  of  a 
priori  information.  Table  1 defines  the  assumptions 
associated  with  the  different  restorations. 
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a)  Figure  la  degraded  by  point  spread  function  of  Figure  4 
b),  c),  and  d)  restorations  see  Table  1. 


Figure  6.  a),  b),  c),  and  d)  restorations  see  Table  1. 
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If  one  has  knowledge  of  the  OTF  and  also  knowledge  of 
the  power  spectrum  of  the  undegraded  Image,  one  can  obtain 
the  restoration  of  figure  5b.  Figure  5c  assumes  knowledge 
of  the  magnitude  of  the  OTF,  knowledge  of  the  power  spectrum 
of  the  undegraded  image,  and  an  estimate  of  the  phase  of  the 
OTF  of  zero  everywhere.  Using  the  algorithm  previously 
reported  [1-2]  to  estimate  the  phase  of  the  OTF  and  assuming 
knowledge  of  the  magnitude  of  the  OTF  and  knowledge  of  the 
power  spectrum  of  the  undegraded  image,  one  obtains  figure 
5d.  Upon  comparison  of  figures  5c  and  5d  it  is  apparent 
that  the  phase  estimate  per  se  does  not  afford  any 
improvement  over  a phase  estimate  of  zero  everywhere. 

Upon  examination  of  figure  6a  where  the  magnitude  of 
the  OTF  is  estimated  using  the  power  spectrum  depicted  in 
figure  3,  and  where  in  addition  the  power  spectrum  of  Figure 
3 is  used  as  an  estimate  of  the  power  spectrum  of  the 
undegraded  image,  and  lastly  assuming  the  phase  of  the  OTF 
is  known,  it  is  evident  that  estimation  of  the  magnitude  of 
the  OTF  is  very  good. 

Figure  6b  assumes  no  a priori  knowledge.  Figure  6c 
assumes  no  apriori  knowledge  of  the  magnitude  of  the  OTF, 
nor  any  knowledge  of  the  power  spectrum  of  the  undegraded 
image  and  assumes  an  estimate  of  zero  everywhere  as  an 
estimate  of  the  phase  of  the  OTF. 

The  phase  estimate  can  be  shown  experimentally  to 
converge  to  a phase  which  is  of  incorrect  sign  and  to  a 
value  less  than,  in  some  cases  much  less  than,  the  correct 
value.  Thus,  It  is  not  unrealistic  to  change  the  sign  of 
the  estimated  phase  and  boost  the  estimates  by  factors. 
Figures  6d,  7a,  7b,  and  7c  correspond  to  the  phase  estimates 
with  a sign  change  and  multiplications  of  1,  2,  3,  and  4 
respectively.  As  judged  by  the  increased  sharpness  in  the 
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collar  of  the  nan  and  leas  distortion  in  the  face  of  the 
nan,  perhaps  sone  inprovenent  has  been  obtained. 
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3.6  On-Axis  Optical  Filtering  System  (Supported  by  NSF  Grant 
ENG-76-15318  and  AFOSR  under  Contract  AFOSR-77-3285) 

Alexander  A.  Sawchuk  and  Chung-Kai  Hsueh 


Computer  generated  holograms  have  many  advantages  over 
optical  generated  holograms.  If  the  hologram  is  used  in  a 
filtering  system,  the  filter  H ( f ) can  be  specified  without  a 
physical  specimen  of  the  impulse  response.  Computer 
generated  holograms  also  eliminate  the  need  of  very  stable 
and  low  noise  photographic  recording  set  up.  With  computer 
plotting,  the  nonlinearity  of  the  film  and  other  effects  of 
the  system  can  be  precompensated  and  more  flexibility  can  be 
achieved . 

However  most  of  the  computer  generated  holograms  are 
off-axis.  The  desired  output  appears  on  the  first 
diffraction  order  which  has  a maximum  diffraction  efficiency 
of  41%  when  square  wave  phase  grating  is  used.  Since  most 
of  the  energy  is  concentrated  at  the  center,  to  get  useful 
results,  we  have  to  increase  the  carrier  frequency  and 
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restrict  the  size  of  the  impulse  response  and  the  object  to 
avoid  overlapping  between  two  adjacent  diffraction  orders. 

On-axis  holograms,  on  the  other  hand,  do  not  have  the 
problems  mentioned  above.  The  desired  output  is  on  the  axis 
and  contains  most  of  the  energy.  As  will  be  explained  later 
if  the  high  order  diffractions  can  be  eliminated  properly 
the  whole  output  plane  is  available  and  the  size  of  the 
object  need  not  be  restricted.  The  sampling  rate  of  the 
filter  then  depends  on  the  extent  of  the  impulse  response 
rather  than  on  the  object  extent  which  is  generally  much 
larger.  Therefore  the  sampling  rate  is  much  lower  compared 
to  other  holographic  type  filters.  However  for  proper 
suppression  of  the  higher  diffraction  orders,  the  sampling 
rate  still  has  to  be  considerably  high  as  will  become  clear 
in  the  following  analysis. 

To  simplify  the  notation,  we  use  one-dimensional 
variables  in  the  following  formulation.  The  complex  filter 
H (u)  is  designed  to  have  coherent  impulse  response  h(x)  and 
incoherent  response  | h ( x ) | 2 in  the  image  plane.  Denoting 
the  Fourier  transform  of  h(x)  by  H(u),  the  digital  hologram 
Hg(u)  approximates  H(u)  by  a series  of  weighted  pulses 

Hg(u)  - |^H(n6u)6(u-n6u)j  * p(u)  j • M(u)  (1) 

shown  in  figure  1.  The  pulse  spacing  is  6u,  each  pulse  has 
shape  p(u)  with  weights  given  by  sampled  values  H{n<su).  In 
general  the  pulse  has  a rectangular  shape  with  width  <5u, 
i.e. , 


p(u)  - rmct 


(2) 


The  function  M(u)  is  a mask  representing  the  physical  size 
limit  of  the  hologram.  If  we  assume  that  H(u)  is 
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bandlimited  and  the  size  of  the  hologram  is  large  enough  to 
cover  the  whole  spectrum,  then  we  may  drop  the  mask 
function.  Thus  (1)  becomes 


■*,'*>  ■ [H‘u>-re  «■*>  (&)]  * ««(&)  »> 

where 

comb(u)  - £$(u-n)  (4) 

n 

When  this  hologram  is  put  in  the  filtering  system  we  obtain 
the  impulse  response  hg(x)  which  is  the  inverse  Fourier 
transform  of  Hg(u) 

v*>  - ’'Mv”)} 

- [h(x)  * comb(<5ux)  J • (6u)sinc(6ux)  (5) 

or 

hs s*nc(6ux)  (6) 

The  total  output  is  a displaced  series  of  impulse 
responses  of  the  form  h(x) , weighted  by  a sinc(6ux)  factor 

ae  shown  in  figure  2.  Notice  that  the  higher  diffraction 
orders  fall  on  the  zeroes  of  sinc(6ux)  functions  and 
therefore  are  strongly  attenuated.  In  general,  these  higher 
order  responses  can  be  neglected  and  the  whole  image  plane 
is  available.  Therefore  the  size  of  the  object  is  no  longer 
restricted  and  the  sampling  interval  of  the  filter  is 
determined  by  the  extent  of  the  impulse  response  x n rather 
than  by  the  extent  of  the  object  xQ.  We  need 
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6u  < zr 
*n 


(7) 


However,  to  properly  eliminate  the  higher  order  responses, 
we  require  the  higher  order  responses  to  be  concentrated 
around  the  zeroes  of  the  sinc(£ux)  function  and  <$u  is 
generally  smaller  than  the  requirement  imposed  by  (7). 


If  the  higher  order  responses  cannot  be  neglected,  the 
sampling  rate  must  be  much  higher.  To  avoid  overlapping 
between  adjacent  orders  we  require 

^ > xn+xo  (8) 


Generally  x is  much  smaller  than  x„  so  this  condition 
n o 

becomes 


fiu  > 


x0 


(9) 


The  sampling  interval  is  determined  by  the  extent  of  the 
object  rather  than  the  impulse  response. 


To  further  reduce  the  intensity  of  higher  diffraction 
orders,  we  may  use  a smoother  microdensitometer  plot.  For 
example  if  a simple  linear  interpolator  is  used  then  the 
impulse  response  becomes 

hg(x)  - [£h(x-~)]  sinc2(6ux)  (10) 

Additional  sidelobe  reduction  is  achieved  by  the 
sinc2(6ux)  factor.  Basically  this  problem  is  similar  to  the 
apodization  problem  in  which  a proper  window  is  used  in 
order  to  reduce  the  sidelobes.  However  these  different 
profiles  may  be  difficult  to  implement  by  optical  plotting. 
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Besides  the  reduction  of  the  sidelobes  the  sine 
function  also  modifies  the  impulse  response  slightly  due  to 
the  modulation  near  the  peak  of  sinc^ux),  but  this  effect 
is  small.  If  desired,  the  function  h(x)  could  be 
premultiplied  by  a window  function  correction  term.  The 
transform  of 


h'  <*>  - .InctiL)  I *1  < V2  <U> 

could  be  plotted  so  that  h(x)  would  appear  correctly.  This 
modification  also  increases  the  intensity  in  the  higher 
diffraction  orders.  However  this  effect  is  small  due  to  the 
fact  that  high  diffraction  orders  are  strongly  attenuated  by 
the  zeroes  of  the  sine  function  as  illustrated  in  figure  2. 

To  achieve  on-axis  complex  operations,  several  new 
techniques  have  been  proposed.  The  most  straightforward 
method  is  to  plot  the  amplitude  and  phase  of  the  transfer 
function  separately.  The  phase  plot  is  bleached  and  then 
superimposed  on  the  amplitude  plot.  The  registration 
problem  can  be  simplified  by  plotting  grating  patterns 
outside  the  aperture.  The  Moire  fringes  produced  by  these 
patterns  can  be  used  for  alignment.  The  sandwich  of  the  two 
holograms  is  then  permanently  fixed  together. 

Other  on-axis  holograms  including  kinoform  [1]  and 
ROACH  [2]  have  been  used.  Kinoforms  are  made  by  discarding 
the  amplitude  of  the  transform  of  a diffused  object.  Due  to 
the  lack  of  amplitude  information,  degradation  occurs  in  the 
reconstruction.  Besides  using  a random  phase  diffuser  or 
deterministic  diffuser  [3]  to  even  the  spectrum,  several 
iterative  methods  [2,4]  cat)  be  used  to  achieve  a flat 
spectrum.  These  methods  are  convergent  in  general,  however, 
we  do  not  expect  to  get  perfect  reconstruction  in  a finite 


-156- 


i wjyytTy^:'  ‘ ‘ *#srf  v r- 


r 

number  of  steps.  Therefore  instead  of  discarding  the  slow 
amplitude  variation  of  the  spectrum,  we  may  implement  it  as 
a sandwich  hologram.  Since  the  slowly  varying  amplitude  has 
a low  dynamic  range,  the  limited  dynamic  range  of  the  film 
is  no  longer  a problem.  The  ROACH  stores  amplitude  and 
phase  information  on  different  layers  of  the  color  film. 
Theoretically  it  is  a perfect  filter  without  any 
registration  problems.  However  it  suffers  from  the  low 
space-bandwidth  product. 


A two-kinoform  system  has  been  proposed  [5]  which  sums 

up  the  responses  due  to  two  kinoforms.  Therefore  the  light 
efficiency  is  improved  while  the  correct  amplitude 
information  is  preserved.  Suppose  the  amplitude  of  the 
transfer  function  is  normalized  to  unity,  then  each  vector 
AeJ  in  the  unit  circle  can  be  decomposed  into  two  vectors 
with  lengths  of  1/2  so  that 

Ae-*e  - y e-Ha-ty)  + 1 eJ  <6 -^)  ( U) 

i>  - cos'^A  0 1 ♦ £ J (13) 


This  decompositon  and  the  optical  set  up  have  been  shown  in 
figure  1 and  figure  2 of  reference  5 except  the  circle 
should  be  a unit  circle. 


Similar  ideas  have  been  discussed  by  Severcan  (6]  and 
Shmarev  [7].  Severcan  has  discussed  a two-channel  phase 
only  filtering  system.  In  the  two-channel  phase  only 
filtering  system  he  used  a grating  to  modulate  the  input 
signal  into  two  higher  frequency  channels.  Two  filters 
corresponding  to  the  two  kinoforms  used  are  placed  in  these 
channels.  To  extract  the  desired  output,  one  requires 
another  grating  and  low  pass  filter  in  another  imaging 
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system.  This  system  has  diffraction  efficiency  of  66%  and 
suffers  from  the  requirement  of  very  accurate  alignment  of 
the  second  grating.  In  the  on-axis  phase  only  filtering 
system,  he  plotted  the  phase  only  function  side  by  side  in  a 
cell.  Upon  reconstruction  the  noise  due  to  the  parity  term 
can  be  neglected  for  the  zero-th  diffraction  order,  however, 
this  noise  term  dominates  in  the  first  diffraction  order. 
To  get  useful  results  we  have  to  either  limit  the  object 
size  or  increase  the  sampling  rate  for  the  impulse  response. 

In  a recent  paper  [7]  Shmarev  discussed  a similar 
kinoform  system  using  a balanced  grating.  However  he 
decomposed  the  vector  into  two  unit  vectors,  thereby 
reducing  the  light  efficiency  by  50%.  The  use  of  the 
grating  further  reduces  the  intensity  of  the  output. 

In  general,  there  are  many  ways  to  decompose  a vector 
in  the  unit  circle  into  two  constant  vectors.  A vector  Ae-^9 
in  the  unit  circle  can  be  written  as 

AeJ0  _ " |y<e+*)  + e^0-"0]  0 < A < 1 (14) 

where 

♦ - cos-1  [A/n]  n > 1 (15) 


In  the  two-kinoform  system  n«l  and  the  results  are 
eqs.(12)  and  (13).  Shmarev  uses  n-2  and  obtains 


A«J®  - + eJ 


(16) 


where 


♦ • co*_1A/2 


I £ ♦ 


(17) 
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In  the  set  up  shown  in  figure  2 of  reference  5,  if  both 
channels  have  the  same  intensities,  then  every  decomposition 
for  a particular  n would  give  the  correct  vector  except  a 
constant  factor.  For  example  n*2  results  in  a vector  with 
half  of  the  magnitude  of  the  vector  resulted  from  n*l. 
Light  efficiency  i6  therefore  reduced  for  n>l.  Furthermore, 
for  larger  n,  P varies  over  a smaller  range  as  illustrated 
by  eq. (17) . For  computer  generated  holograms  this  means 
more  quantization  levels  in  order  to  obtain  the  same  amount 
of  accuracy.  Therefore  it  is  better  to  use  the 

decomposition  with  n*l. 

When  n-1;  the  angle  4*  is  related  to  amplitude  A by 
A = cos  0 < < j (18) 

Its  derivative  is  given  by 

A'  - sin  0 < < j (19) 

For  a computer  generated  hologram,  <1*  is  quantized  to  a 
* A A 
certain  level  4*  and  the  amplitude  becomes  A*cos  \p  rather  than 

A.  The  derivative  of  A is  an  increasing  function  with  the 

minimum  at  iji-O  and  the  maximum  at  ip*- rr/2 . Equivalently,  the 

same  amount  of  quantization  error  in  ip  results  in  small 

error  in  A when  A*1  (i|ia0)  and  larger  error  in  A when  A*0 

(t(isn/2).  Therefore  we  can  conclude  that  the  phase  coding 

techniques  can  be  applied  here  to  even  the  spectrum  and 

therefore  to  reduce  the  quantization  error. 

In  the  above  analysis  the  intensity  of  the  impulse 
response  is  of  main  concern.  We  can  thus  take  advantage  of 
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allowing  the  phase  of  the  impulse  response  to  be  arbitrary 
in  order  to  obtain  the  desired  transform  distribution.  One 
particular  application  is  the  continuous  display  of  the 
discrete  data  (5) . The  interpolators  used  for  these 
purposes  are  separable  in  general.  As  a matter  of  fact,  the 
separated  functions  have  the  same  form.  Therefore  phase 
coding  can  be  done  in  one  dimension  and  the  final  result  is 
the  product  of  two  one-dimensional  functions.  If  Hirsch's 
method  (1J  is  used  for  phase  coding  of  an  N x N picture  with 


m iterations,  the  number  of  multiplications  needed  is 

Nj  - 2m  • 2N2logN2  - 2N(2m  • 2NlogN)  (20) 

When  the  function  is  separable,  the  number  of 

multiplications  becomes 

N2  - 2m  • 2NlogN  + N2  (21) 

If  m is  large,  the  second  term  of  N2  can  be  neglected.  The 


computing  time  is  then  reduced  by  a factor  of  2N.  In  fact 
the  N2  operations  are  simply  additions  for  kinoforms. 

Although  the  phase  coding  methods  have  significantly 
increased  the  light  efficiency  and  reduced  the  quantization 
error,  the  phase  associated  with  the  impulse  function  tends 
to  be  irregular.  This  might  result  in  'built-in  speckle' 
noise  even  if  incoherent  processing  is  used.  Future  work 
will  include  the  study  and  Implementation  of  other  phase 
coding  methods  which  would  result  in  less  noise. 
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I ntroduction 


The  use  of  pseudocolor  to  Improve  the  Information 
transfer  from  a display  to  the  human  observer  is  well 
established.  Although  its  potential  applications  are 
numerous,  pseudocolor  has  to  date  been  used  almost 
exclusively  to  encode  image  intensity  variations.  Another 
application  that  has  been  considered  is  to  encode  variations 
in  spatial  frequency  content  of  an  image  by  pseudocoloring 
in  the  Fourier  transform  domain  (1].  Although  this  showed 
great  promise  for  various  applications  such  as  texture 
analysis  it  was  never  actively  pursued.  The  reason  for  this 
is  the  difficulty  and  expense  of  implementing  Fourier  domain 
pseudocoloring  in  a digital  system.  This  is  because  it 
involves  one  forward  two-dimensional  Fourier  transform, 
followed  by  the  generation  of  three  color  filter  planes  each 
of  which  must  finally  be  inverse  Fourier  transformed  to 
generate  the  final  color  image.  All  of  this  is  very  time 
consuming  on  a digital  computer.  However  this  entire 
process  can  be  very  easily  implemented  optically.  The  basic 
idea  is  to  have  a spatial  filtering  setup  which  uses  a 
multi-color  filter  in  the  Fourier  plane  to  encode  different 
spatial  frequencies  with  different  colors.  Such  a system' 
can  be  developed  using  coherent,  incoherent,  or  partially 
coherent  illumination.  Preliminary  analysis  indicates  that 
this  is  one  application  where  a partially  coherent  system 
can  be  used  to  particular  advantage,  combining  the  best 
features  of  coherent  and  incoherent  systems  without  many  of 
the  disadvantages  of  the  respective  systems.  In  particular, 
a properly  designed  system  using  partially  coherent 
illumination  will  result  in  a system  that  combines  good 
chromatic  differentiation  of  various  spatial  frequency 
components  with  excellent  signal-to-noise  ratio 
characteristics. 

In  the  following  section  we  present  a theoretical 
analysis  of  a Fourier  plane  pseudocolor  system  with 
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partially  coherent  illumination  and  study  the  effects  of 
going  to  the  coherent  or  incoherent  limits. 

Theory 

Although  theoretical  analyses  of  partially  coherent 
filtering  systems  are  straightforward  in  concept,  the 
resultant  calculations  are  usually  intractable  for  any 
general  treatment.  However  if  enough  restrictions  are 
placed  on  the  system,  a tractable  problem  can  be  found.  In 
this  section  we  wish  to  outline  the  analysis  of  a system 
which  is  admittedly  very  restricted,  but  which  gives  results 
which  are  easily  understandable  and  which  offer  at  least  a 
qualitative  explanation  of  the  experimental  systems 
described  here. 

The  system  we  wish  to  analyze  is  the  one-dimensional 
system  shown  in  figure  1.  Two-dimensional  analysis  does  not 
alter  the  results  in  essence  but  makes  the  evaluation  quite 
cumbersome  and  less  transparent.  The  source  will  be 
considered  to  be  spatially  incoherent  with  a temporal 
spectral  distribution  which  can  be  approximated  as  three 
monofrequencies  Vj^-c/A^,  v2«c/A2  and  v3“c/A3  which  we  will 
refer  to  as  color  1,  color  2 and  color  3 respectively.  The 
source  is  assumed  to  have  a uniform  intensity  over  the 
region  -yg  < u < w8. 

The  object  we  will  consider  has  the  following  amplitude 
transmittance  (independent  of  illumination  wavelength) 

V*>  " *0  + A1  c°*(2irVox)  (Is) 

where  An,  A.,  and  u are  real  constants, 
u i o 
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source 


object 


pupil 


image 


Figure  1 


tical  system.  An  extended 
lte  light  source  Is  used  to 
Illuminate  the  object.  A color 
filter  is  placed  in  the  pupil  plane. 
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Thus  the  intensity  of  the  object  is 


A?  A? 


2 nl  A1 

Iq(x)  * Ag  + — + 2AgA^  cos(2irugX)  + — cos  (4irugx) 


(lb) 


In  the  pupil  plane  we  have  basically  three  independent 
filters,  one  which  transmits  only  in  the  spectral  region  of 
V1  (color  1),  one  which  transmits  only  at  \>2  (color  2)  and 
one  which  transmits  only  at  v (color  3).  These  filters 
have  the  following  binary  transmittances  for  colors  1,  2 and 
3 respectively  (see  figure  2): 


where 


f^Cu)  - rect 


^(w)  " rect 


fjCu)  ■ rect 


rect 


(^)(i-”ct('^)) 

1 If  -Pc  < p < uc 

0 otherwise 


(2a) 


(2b) 


(2c) 


The  color  1 filter  f^  is  a low  pass  filter  with  cutoff 
frequency  uc  where  as  the  color  2 and  color  3 filters  are 
high-pasa  filters  with  the  cutoff  frequencies  2y  and  3u  . 

C 

Here  u represents  a spatial  frequency  variable.  The 
corresponding  variable  in  the  physical  system  is  where 
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2u. 


-3p. 


-2  y. 


Figure  2.  Three-color  filter 
In  the  pupil  plane.  The  coordinate  p 
represents  spatial  frequency  In  the 
linage  plane. 
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i.e.  u Must  be  scaled  by  the  appropriate 
illuaiination  wavelength  Aj. 

For  a spatially  incoherent  and  guasistonochrosiatic 
source,  we  can  write  the  Fourier  transform  of  the  image 
intensity  in  color  j as  follows  (2]t 

m 

!,<*)  - f j fj(u,)f*(M,-M)v(p,,)v*(M"-»i)S(M,-M,,)dM,dM"  (3) 

where  * denotes  complex  conjugate,  S»kP„  rect (jjp)  is  the 

source  intensity  and  v is  the  Fourier  transform  of  the 
object 

09 

v(ii)  “ c f u(x)  exp(-2iriux)dx 


From  eq. (la)  we  see  that 

v(u)  - c | (AQ+A1cos2nu0x)exp(-2iriyx)dx 
• 00 

" c [Ao6  (6  (m+Vq))]  (4) 

Substituting  eq. (4)  into  eq.(3)  gives  us  an  expression  for 
the  Fourier  transform  of  the  intensity  of  the  color  j.  If 
one  assumes  the  source  intensity  S and  the  filter  function 
fj  are  symmetric,  the  expression  simplifies  to  the  following 
(where  constant  factors  have  been  dropped): 

Ij(u)  ■ Oj(u0X(y)'*,^2~  [*(m-Uq)+4(m+Uq)] 

■f  jd (u-2ug)+g  (y+2pQ)j  (5) 

The  coefficients  are  defined  as  follows 
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^(Mq)  - ■ AQ  I 2 |’|fJ(»i,))2S(li,)dM,+(^-) 
j"  |fj(M,)|2S(li0-li,)du’ 

- m 

2 ,6‘’ 

■ *0 1 ^oj  <0>+(— | — yFoj  ‘“o’ 

where  Pgj(M0)  *•  th*  convolution  of  the  source  with  the 
filter  intensity  transmission  | f j 1 2 

Bj(u0)  - 2Re[AQA*  J f j <u’  )S(u 1 ) f*(u-g  • )du  ’] 


K*  1 'DD' 

AlFljJ 

where  Pj^ j is  the  filter  function  multiplied  by  the  source 
function  convolved  with  the  (complex  conjugated)  filter 
function. 


fj(u0)  vy,)f^2^,>s<v^d'J' 

^ -oo 


(6c) 


Prom  this  we  can  write  an  expression  for  the  intensity  of 
the  three  color  images  as 


tj(x’)  - J Ij  (w)exp(-2iriyx’  )dg 


(7) 

" “j (w0)+Bj (woJcosCZxupx'J+Yj (upjcostAnpQX1 ) 

It  should  be  remembered  that  the  expressions  for  , 

Bj,  and  yj  are  valid  only  for  the  particular  input  object  we 
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have  considered.  Keeping  this  in  mind  we  will  refer  to  , 
6 j , and  Yj  as  pseudo-transfer  functions  for  each  of  the 
three  components  of  eq. (7) . In  figure  3 we  plot  these 
pseudo-transfer  functions  for  the  filter  described  by  eq. (2) 
and  for  three  different  source  sizes.  Figure  3a  represents 
the  limiting  case  as  the  source  size  approaches  zero, 
i.e.  the  coherent  illumination  limit.  In  this  case  we  6ee 
that  there  is  always  a certain  bias  of  color  1 in  the  image, 
but  otherwise  the  color  of  the  image  is  going  to  be  purely 
color  1,  2,  or  3 depending  upon  the  spatial  frequency  W0  of 
the  object.  Thus  we  would  expect  a test  target  made  up  of 
sinusoidal  gratings  to  have  a highly  saturated  color  image. 
The  problem  with  such  a system  is  that  it  is  subject  to  all 
the  noise  problems  of  a coherent  systems  such  as  speckle  and 
diffraction  from  dust,  etc.  Figure  3b  shows  the 
pseudo-transfer  functions  for  a partially  coherent  system 
where  the  image  of  the  source  in  the  filter  plane  is  half 
the  size  of  the  central  filter  (color  1).  Although  there  is 
some  overlap  of  the  pseudo-transfer  functions  for  different 
colors,  in  general,  this  system  is  seen  to  also  yield  fairly 
saturated  colors.  It  is  important  to  note  that  unlike  the 
coherent  system,  this  partially  coherent  system  yields 
slightly  different  colors  and  varying  degrees  of  modulation 
for  two  gratings  at  different  but  approximately  equal 
frequencies.  Finally,  figure  3c  shows  the  pseudo-transfer 
functions  for  the  limiting  case  as  u_  becomes  very  large, 

O 

i.e.  spatially  incoherent  illumination.  In  this  case  Bj  is 
the  standard  modulation  transfer  function  (MTF)  for  the 
incoherent  imaging  system,  ctj  and  Yj  follow  directly  from 
Bj.  The  important  thing  to  note  is  that  there  is  a constant 
bias  term  with  equal  contributions  from  all  three  colors 
(independent  of  liQ ) • This  implies  poor  saturation  in  the 
color  image  of  a sinusoidal  grating.  Thus  in  many  respects, 
the  partially  coherent  system  combines  the  good  qualities  of 
both  the  coherent  and  incoherent  systems,  specifically  good 
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saturation  with  low  noise  and  good  color  differentiation  of 
adjacent  spatial  frequencies.  It  should  be  noted  that,  as 
can  be  seen  from  the  plots  of  3 j , the  fundamental  frequency 
term  tends  to  be  eliminated  in  the  coherent  and  partially 
coherent  systems.  This  leads  to  false  imaging  effects  for 
these  systems  which  can  be  disturbing  in  certain 
applications. 

The  color  saturation  in  the  image  is  generally  high  as 
long  as  the  image  of  the  source  is  confined  to  one  color  in 
the  filter  plane.  Thus  a source  intensity  distribution 
described  by  eqs.(2a),  (2b),  or  (2c)  would  give  good 
saturation  with  those  filters. 

Conclusions 


We  have  discussed  an  optical  system  for  color  encoding 
of  the  spatial  frequency  content  of  an  image.  A theoretical 
analysis  is  given  for  the  general  partially  coherent 
illumination  case.  Preliminary  experimental  work  has 
verified  the  theoretical  results.  This  work  will  be 
reported  on  later. 

It  is  felt  that  this  system  could  have  several 
significant  applications  in  image  analysis.  Its  application 
to  texture  analysis,  for  instance,  seems  particularly 
promising. 
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4.  Smart  Sensor  Projects 


This  section  represents  the  smart  sensor  phase  of 
research  funded  during  the  past  six  months.  Two  circuits 
have  been  fabricated  for  CCD  analog  near  focal  plane 
processing  to  implement  a variety  of  front  end  image 
processing  functions.  The  first  circuit  implements  the 
Sobel  operator  on  an  image.  This  represents  a nonlinear 
spatially  invariant  processor.  The  second  circuit  is 
designed  to  implement  both  nonlinear  spatially  invariant  as 
well  as  variant  processes.  Both  circuits  have  been 
fabricated.  The  Sobel  circuit  has  been  tested  with  success, 
(see  accompanying  section)  and  the  second  circuit  is  soon  to 
experience  testing. 


4.1  Charge-Coupled  Technology  for  Image  Understanding 
Graham  R.  Nudd 


Principal  participants  in  this  contract  during  this 
reporting  period  were  Jerry  Erickson,  Paul  Nygaard, 
Dale  Sipma,  Gary  Thurmond,  and  Graham  Nudd. 

Three  principal  tasks  have  been  undertaken  since  the 
last  report,  including: 

the  fabrication  and  testing  of  Test  Circuit  I; 

the  detailed  design,  layout  and  processing  of  Test 

Circuit  II;  and 

the  design  and  construction  of  our  test  facilities. 

We  have  made  significant  progress  in  each  of  these  areas. 
The  first  test  circuit  is  now  operating  on  test  images 
generated  on  the  USC  PDP-10  computer,  and  the  photographs  of 
the  processed  data  are  included  here,  together  with  the 
detailed  test  results.  The  second  test  circuit  has  now  been 
designed  and  initial  circuits  are  currently  available. 
Details  of  this  are  also  included.  Finally,  as  the  testing 
has  progressed,  we  have  made  considerable  modifications  to 
our  test  facilities  to  provide  more  flexibility  and  also 
added  a good  deal  of  software  to  interface  between  the 
microprocessor,  the  PDP-10  and  the  microprocessor  and  the 
CCD  circuits. 

Test  Circuit  £ 

Details  of  the  first  test  circuit,  a three  by  three 
Sobel  edge  detection  circuit,  have  been  given  in  the 
sesii-annual  report  dated  April  1977.  The  basic  algorithm 
operates  on  the  nine  picture  elements  shown  in  figure  1 and 
provides  the  output 
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S(t)  - | (a  + 2b+c)  - (g  + 2h  ♦ I)  | + | (a  + 2d  + g)  - (c  + 2f  + i)  | 


Figure  1.  Schematic  of  the  CCD  wbal  circuit. 


S (e)  - | (a+2b+c)  - (g+2h+i) | + | (a+2d+g)  - 

, (1) 

(c+2f+i) | 

It  performs  this  by  the  combination  of  two  circuit  elements, 
a two-dimensional  CCD  filter  and  an  absolute  value  circuit 
as  shown  in  figure  1.  The  two  dimensional  filter  operates 
on  three  parallel  lines  of  charge,  representing  three 
adjacent  lines  of  image  data  and  forms  the  four  outputs: 
(a+2b+c) , (-g-2h-i) , (a+2d+g) , and  (-c-2f-i).  It  does  this 
by  nondestruct ively  sensing  the  charge  using  a floating  gate 
structure  with  two  basic  gate  sizes  to  provide  the 
two-to-one  weighting  ratio.  Pairs  of  outputs  are  then 
combined  to  form  the  x and  y components  of  the  Sobel 
operator 


Sx  “ (a+2b+c)  + ( -g-2h- 1) ; and 

Sy  - (a+2d+g)  + (-c-2f-i)  (2) 

respectively.  Finally  these  two  components  are  used  as 
inputs  to  the  CCD  absolute  value  circuits  which  forms  (Sx 
and  Sy)  prior  to  addition,  providing  the  full  Sobel 
operation.  Two  absolute  value  circuits  have  been  included 
on  Test  Circuit  I as  described  in  the  April  1977  semi-annual 
report.  A schematic  of  the  first  circuit  is  shown  in 
figure  2.  It  consists  of  a modified  Tomp6ett  input  with  two 
input  gates  B1  and  SIG.  A reference  signal,  equivalent  to 
the  zero  level  of  the  absolute  value,  is  applied  to  B1  and 
the  input  signal  is  applied  to  SIG.  When  the  input  is  more 
negative  than  the  reference,  Bl,  a potential  well  of 
capacity  (Bl  - SIG)*eA/t  , coulombs  is  formed  where  A is 

OX 

the  gate  area,  t is  the  oxide  thickness  and  c is  the 
dielectric  constant.  Hence,  when  the  diffusion  is  pulsed, 
this  amount  of  charge  collects  under  gates  B2  and  FZ.  If, 
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however,  SIG  is  positive  with  respect  to  the  reference,  a 

charge  (SIG  - Bl)«eA/t  coulombs  collects  under  electrodes 

ox 

PZ  and  SIG.  Hence  if  the  area  of  all  the  electrodes  are 
equal,  a charge  equivalent  to  | (SIG  - Bl)-eA/tox|  is 
generated  in  either  case  and  an  absolute  magnitude  operation 
is  performed.  By  having  two  such  circuits  in  parallel,  the 
addition  shown  in  eq. (1)  is  performed  and  potentials  for 
this  circuit  are  shown  in  figure  3. 

Figure  2b  shows  a second  absolute  value  circuit  which 
requires  two  parallels  to  perform  a single  operation.  The 
potential  profiles  through  the  two  channels  are  shown  in 
figure  4.  The  technique  relies  on  any  voltage  change  in 
vsig  changing  the  potential  profile  under  the  SIG  electrode 
and  spilling  charge  * AVsigeA/tox  throu9h  only  one  channel. 
For  example,  as  shown  in  the  figure,  a positive  change  in 
will  create  a charge  flow  in  the  top  channel  only. 

A photograph  of  the  processed  circuit  is  shown  in 

figure  5.  It  consists  of  a two  phase  n-channel  CCD 

structure  with  approximately  3 x 25  stages  in  the  array.  In 

operation  charge  proportional  to  the  picture  intensity  in 

three  adjacent  lines  is  clocked  through  the  three  channels 

and  outputs  corresponding  to  the  Sobel  operator  are  formed 

on  the  absolute  value  circuit  output  shown  beneath  the  2D 

filter.  Because  the  charge  loss  per  stage  of  these 

3 

circuits  can  be  made  very  low,  less  than  one  part  in  10  , 
other  processing  functions  have  been  included  by  adding 
circuits  in  series.  In  this  case,  a high-pass  temporal 
filter,  a two-dimensional  Laplacian  operator  and  an  aperture 
corrector  have  been  included. 

During  this  period  of  the  contract,  the  initial  testing 
and  performance  analysis  of  the  circuit  has  been  performed. 
The  full  masking  process  consists  of  eight  levels  and  two 
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omissions  have  been  encountered  on  the  original  masks  which 
have  now  been  corrected  to  provide  working  parts.  The  first 
mask  error  was  an  open  circuit  on  one  of  the  clock  lines, 
<J>2»  which  prevented  charge  from  being  properly  clocked 
through  the  full  array.  Initially  attempts  were  made  to 
wire  bond  directly  to  the  clock  line,  but  this  proved  to  be 
very  difficult  and  the  results  were  unreliable,  and  a new 
mask  was  purchased.  Fortunately  this  error  occurred  on  the 
upper  level  and,  since  we  had  other  partially  processed 
wafers,  we  were  able  to  provide  corrected  circuits  in  about 
six  weeks.  A later  problem  occurred  on  absolute  value 
circuit  1 which  prevented  the  x component  of  the  Sobel  from 
operating.  This  was  an  unconnected  diffusion  which 
eliminated  the  charge  source  to  the  Tompsett  input.  Here  we 
were  able  to  use  a wire  bond  which  provided  satisfactory 
operation,  and  the  full  Sobel  circuit  is  now  operating. 
These  circuit  problems,  which  are  to  be  expected  in  the 
initial  runs  of  circuits  of  this  complexity,  have  caused  two 
to  three  months  delay  in  our  original  plans.  We  have  now 
however  established  the  operation  of  the  initial  concepts, 
and  a brief  outline  of  the  test  procedures  are  included. 

The  circuit  performance  has  been  evaluated  with  the 
computer  controlled  test  facilities  described  in  a later 
section.  In  concept,  image  data,  either  from  the  USC-IPI 
data  base  or  from  specially  prepared  test  patterns,  is 
accessed  via  a standard  30  characters/second  data  link  and 
stored  in  the  random  access  memory  (RAM)  of  the 
microcomputer  controlling  the  test  set-up.  Special  purpose 
CCD  drivers  and  clocks  have  been  built  to  drive  the  CCD 
circuits,  request  data  from  the  RAM  and  return  processed 
data  back  to  the  computer.  Details  concerning  the  timing 
and  data  flow  are  given  in  that  section. 


-181- 


The  initial  test  for  the  circuit  was  to  ensure  that 
charge  was  being  clocked  through  the  2D  filter  from  the 
Tompsett  input.  Accordingly,  the  voltage  in  each  input  gate 
for  the  three  parallel  arrays  was  varied  as  the  outputs  on 
Sobel  'x'  (Sx)  and  Sobel  'y'  (S^)  were  monitored. 

In  this  way,  the  effective  operation  of  the  top  and 
bottom  channels  can  be  monitored  using  a dc  voltage.  It 

should  be  noted  that  the  gate  connection  on  the  center 

channel  requires  a time  varying  waveform  to  test  its 
operation.  The  output  from  the  three  floating  gates  on  the 
top  channel  is  shown  in  figure  6a.  Note  that  here  three 
gates  with  weightings  1/2,  1 and  1/2  are  connected  so  the 

effective  output  has  a weighting  of  2.  Also  since  the 

weightings  are  all  positive,  the  output  has  the  same 
polarity  as  the  input  signal.  A schematic  of  the  waveform 
is  provided  in  figure  6b  for  reference.  The  linear  range  of 
this  operation  was  from  +5.5V  to  7.3V.  The  charge  storage 
for  a single  gate  is  given  by 

«s  * CoxVs  (3, 

where  CQX  ■ sA/tox  is  the  oxide  capacitance.  In  our  case, 

830  um2  and  hence  C - 0.27  pF,  resulting  in  a charge 

-120x 

storage  of  0.54  x 10  coulombs  for  a 2 volt  gate  swing. 

This  is  equivalent  to  3 x 10**  electrons.  The  noise 

associated  with  just  filling  this  well  is  Q * /KT  C , 

n ox 

which  is  equivalent  to  690  electrons.  In  practice,  other 
sources  of  noise  exist,  associated  with  the  trapping  states 
and  thermal  generation,  etc.,  which  might  increase  this 
number  to  approximately  two  or  three  thousand.  Even  so,  a 
dynamic  range  at  the  input  in  excess  of  eight  bit6  should  be 
attainable. 
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Similarly,  the  operation  of  the  lower  channel  is  shown 
in  figure  7a.  Note  the  weighting  on  this  channel 
corresponds  to  -1/2,  -1,  -1/2  or  -2  and  hence  the  polarity 
change.  The  linear  range  for  this  circuit  is  from  5.2V  to 
7.1V.  Figures  6 and  7 indicate  the  effective  clocking  of 
charge  through  the  channels.  The  desired  weightings  can 
only  be  analyzed  using  a time  varying  signal.  Because  of 
the  speed  limitation  on  the  INTEL  8080A  microprocessor,  a 
data  rate  of  approximately  5 kHz  was  used  in  these  tests. 
This  avoids  the  use  of  direct  memory  access,  which  will  be 
incorporated  in  the  next  phase  of  the  program. 

The  appropriate  weightings  for  the  Sobel  x and  y 
components  are  illustrated  in  figure  8,  together  with  the 
impulse  response  for  each  channel  when  measured  at  both  the 
x and  y outputs.  The  time  dependent  outputs  for  a single 
impulse  are  shown  in  figure  9,  from  which  it  can  be  seen 
that  there  are  three  functions  of  interest;  Sobel  "x", 
channel  1 and  3,  and  Sobel  "y".  The  output  waveforms  for 
channels  1 and  3 at  Sobel  "x"  and  channel  2 at  Sobel  "y",  in 
response  to  a single  impulse  of  one  picture  element  duration 
are  shown  in  figure  10.  It  can  be  seen  that  the  form  of  the 
weightings  is  current  although  the  accuracy  is  not 
sufficient  to  provide  operation  equivalent  to  eight  bits. 
We  believe  that  the  loss  of  accuracy  is  due  to  incomplete 
charge  transfer  through  the  full  array  and  we  are  currently 
investigating  techniques  to  improve  this.  In  particular, 
one  of  the  subsequent  circuits  in  the  array  (the  temporal 
high  pass  filter)  has  been  designed  to  operate  with  a 
clocking  waveform  which  is  slightly  shifted  in  phase  from 
the  prime  clocks,  and  adjustment  of  these  clocks  should  lead 
to  better  transfer.  At  present,  we  are  operating  all  clocks 
from  two  phases. 


I 
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Figure  10.  Measured  impulse  response  of  three  channels 


-188- 


To  provide  the  full  Sobel  output,  the  absolute 
magnitude  of  each  component  must  be  taken  and  the  output 
summed,  using  the  CCD  absolute  value  circuit  described 
above.  An  example  of  the  operation  of  this  circuit  is  given 
in  figure  11.  Here  only  the  Sobel  "y"  component  is  used  for 
illustration  and  the  input  to  the  array  consists  of  an 
impulse  of  one  pixel  duration.  This  results  in  a positive 
going  signal,  representing  the  transition  at  the  leveling 
edge  of  the  pulse,  zero  output  when  the  discontinuity  is  at 
the  center  of  the  array  (i.e.,  at  location  "e")  and  a 
negative  signal  corresponding  to  the  transition  on  the 
trailing  edge,  as  shown  in  the  upper  trace.  The  positive 
and  negative  pulses  in  effect  locate  the  edges  of  the 
discontinuity.  The  Sobel  operator,  however,  requires  that 
both  edges  have  the  same  polarity.  This  corresponds  to  the 
output  shown  in  the  lower  trace,  where  both  signals  are 
shown  as  negative  going  outputs.  The  polarity  is  merely 

convention  and  the  negative  outputs  are  a direct  result  of 
an  n-channel  device  operating  with  electrons  as  the 

carriers.  The  form  of  this  lower  waveform  corresponds 

precisely  to  the  full  Sobel  algorithm  for  a single 
discontinuity.  In  terms  of  an  optical  image,  the  input 
would  correspond  to  a vertical  grid,  each  element  having  a 
width  equal  to  one  picture  element,  and  spaced  at  the  pulse 
repetition  interval.  The  output  is  equivalent  to  a grid 
pattern  of  double  lines  representing  the  edges  of  the 
original  pattern. 

The  two  components  of  the  Sobel  operator  have  been 

tested  using  patterns  with  both  vertical  and  horizontal 
symmetry.  Specific  examples  of  this  are  given  in  figures  12 
and  13.  In  figure  12a,  vertical  grid  of  lines  each  of  which 
is  one  pixel  wide  is  shown  as  the  unprocessed  image.  The 
processed  image  (figures  12b)  consists  of  pairs  of  vertical 
lines  spaced  by  a single  picture  element  width  and  represent 
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Figure  12.  Example  of  CCD  Sobel  operation  on  test  pattern  with  vertical  symmetry 
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Figure  13.  Example  of  CCD  Sobel  operation  on  test  pattern  with  horizontal  symmetry 
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the  output  from  the  CCD  Sobel  circuit  operated  at 
approximately  5 kHz.  For  comparison,  the  computer  generated 
Sobel  is  shown  in  figure  12c.  Figure  13  shows  the 
corresponding  results  for  a horizontal  grid,  thereby  testing 
Sobel  "x". 

These  simple  patterns  serve  to  verify  the  CCD  Sobel 
operation  and  provide  data  for  the  circuit  analysis.  Our 
present  work  in  this  area  is  concentrated  in  three  specific 

areas: 

testing  the  circuit  on  a more  extensive  base; 
increasing  the  dynamic  range  and  speed  of  the  present 
devices,  and  providing  a more  rapid  data  transfer 
between  the  microcomputer  and  the  CCD  circuits. 

Test  Circuit  II 

The  aim  of  the  second  test  circuit  is  to  investigate 
the  possibility  of  performing  adaptive  algorithms;  in  this 
case,  based  largely  on  the  local  mean.  Again  a kernel  of 
three  by  three  picture  elements  is  used  and  the  five 
algorithms  included  are: 

Sobel  Edge  Detection 

S(e)  - i/8< | («+2b+c)  - <f+2h+l)| 

+ | (e+2d+g)  - <c+2f+l)|) 

Low  Pass  Filter inq 

Fm  - l/9(a  + b + c + d + e + f + g + h+l) 

Unsharp  Ma sk  1 nq 

FUSM  “ <i-«>s<t,j)  ♦ «F« 


-193- 


Adaptive  Blnar lit r 


Fm  < r/2 
Fm  > r/2 

Because  of  the  complexity  of  building  all  of  these  functions 
in  a single  circuit,  a modular  approach  has  been  employed  as 
shown  in  figure  14.  Here  each  circuit  is  built  as  a 
separate  unit  on  a single  chip  and  the  full  operation  can  be 
obtained  by  using  either  wire  bonds  at  the  crystal  surface 
or  coaxial  interconnects.  The  design  rules  for  thi>»  circuit 
are  given  in  Table  1 and,  where  appropriate,  the  performance 
of  the  circuit  elements  has  been  computer  simulated  using 
the  "SPICE"  simulation  programs.  The  detailed  design  and 
layout  of  this  circuit  is  now  completed  and  the  first  lost 
has  been  processed.  A photograph  of  the  processed  parts  is 
shown  in  figure  15.  Schematics  of  each  circuit  concept  were 
included  in  the  previous  semi-annual  report.  We  anticipate 
that  preliminary  testing  and  performance  evaluation  will  be 
completed  in  the  next  two  months.  The  circuits  themselves 
will  be  bonded  in  forty-eight  pin  dual-in-line  packages 
identical  to  that  used  for  Test  Circuit  I and  it  should  be 
possible  to  use  the  drivers  and  computer  interfaces  already 
developed.  However  to  achieve  the  adaptive  processing  with 
all  circuit  elements,  it  will  be  necessary  to  initially 
provide  external  coaxial  interconnects  between  several 
break-out  boxes.  Then,  as  testing  proceeds,  internal  bonds 
at  the  chip  surface  will  be  employed  to  enable  all  functions 
to  be  obtained  from  a single  circuit.  We  anticipate  that 
further  development  of  the  peripheral  circuitry  such  as 
clocks,  rest  snd  diffusion  pulses  necessary  to  operate  this 
circuit  will  be  relatively  minor. 


F . I1'"'' 
b lo  fm  > , 

Adaptive  Stretching 

12  Min  |e,  r/2| 

Fa  ’ 

a « 2 Max  | (e-r/2) , 0| 
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STARTING  MATERIAL  10  OHM-CM  P-TYPE  (100) 
N-CHANNEL  1000  l GATE  OXIDE  WITH  SELF-ALIGN  IMPLANT 


LAYOUT  RULES  (MILS) 

CHANNEL  STOP  (CS)  P+  DIFFUSION 
SOURCE/DRAIN  (SD)  N+  DIFFUSION 
GATE  OXIDE  (TO) 

POLYSILICON  1 (PS1) 

PS1  .1  OUTSIDE  SD  (WHEN  SA  IMPLANT  USED) 
PS1  .075  INSIDE  SD  (NON  SA) 

PS1  .125  OVERLAP  CS 
POLYSILICON  2 (PS2) 

PS2  .1  OVERLAP  PS1 

THRESHOLD  SHIFT  (TS)  UNDER  PS2  IN  CCDS 
IMPLANT,  BORON 
SELF-ALIGN  IMPLANT  (SA) 

CONTACT  (CT) 

CT  .125  INSIDE  PS1 
CT  .15  INSIDE  PS2 
CT  .10  INSIDE  SD 
CT  .10  OUTSIDE  PS1 
rT  .15  OUTSIDE  PS2 
METAL  fAL) 

AL  .15  OVERLAP  CT 
PAD 


MIN.  WIDTH  SPACE 


.30 

.2  x .2 


.3  TO  CS 
.025  INSIDE  SD 
.125  TO  PS1 


.125 


y x 4.0  7.0  CENTER  TO  CENTE 


DEVICES:  L = .30,  MIN  W = .2 


PS1  L = .35 
PS2  L * .375 

1.05  MILS/BIT 


GAP  = .175 
W = 2.4 


Table  1.  Design  Rules  for  Test  Circuit  II 
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Figure  15.  Photograph  of  test  circuit  II  (size  approximately  190  x 190  mils^) 
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Development  of  Test  Facilities 

In  addition  to  the  development  of  Test  Circuits  I and 
II,  we  have  made  considerable  progress  with  our  test 
facilities,  and  most  of  the  necessary  equipment  has  been 
designed  and  built.  The  CCD  test  facility  has  been  set  up 
to  accomplish  two  main  objectives: 

1.  Provide  the  necessary  signals,  dc  levels,  and 
adjustments  for  functionally  testing  the 
operation  of  the  CRC-111  and  CRC-115  chips. 

2.  Demonstrate  the  operation  of  these  chips  on  real 
images. 

The  following  tasks  have  been  done  to  accomplish  these 
objectives: 

- A box  was  built  to  hold  the  48  pin  chips  and  connect 
their  leads  to  48  bnc  outlets. 

- Two  boxes  were  built  to  provide  12  independent, 
adjustable  dc  levels. 

- A "CCD  driver"  box  was  built  to  provide  the  necessary 
periodic  waveforms  required  for  operating  the  chips. 
These  waveforms  include  two  clocks,  two  resets,  two 
diffusion  signals  and  clocking  signals.  These  signals 
have  adjustments  for  gain,  offset  and  timing  for 
optimizing  the  operation  of  the  chips. 

- The  microprocessor  (IMSAI  8080)  was  interfaced  with 
the  "CCD  driver"  box  so  that  it  can  output  three  analog 
channels  and  read  in  one  analog  channel  synchronously 
with  the  "CCD  driver"  box.  Thus  data  in  computer 
memory  can  be  formatted  into  three  parallel  output 
channels,  fed  to  the  CCD  chips,  and  the  resulting 
operation  (Sobel,  median,  etc.)  can  be  read  and  stored 
back  into  computer  memory. 

- a display  with  16  grey  levels  and  a 128  x 128  pixel 
resolution  was  built  for  the  microprocessor.  Each 
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128  xl28  x 4 bit  picture  occupies  8K  of  8080  memory. 
Pictures  and  programs  are  stored  permanently  on 
cassette  tape. 

- A means  of  transmitting  a picture  from  the  USC  time 
sharing  system  to  the  microprocessor  was  developed.  To 
accomplish  this,  a version  of  the  Scene  Analysis  System 
used  by  Hughes  has  been  modified  to  contain  a command 
which  will  encode  a picture  into  ASCII  and  output  it  on 
the  terminal.  The  microprocessor  then  intercepts  the 
picture  as  it  is  transmitted. 

- Several  programs  for  the  microprocessor  test 
facilities  have  been  written.  These  include  the 
following : 

A.  CALIB:  This  program  is  used  for  calibrating  the 

input  and  output  analog  channels  of  the 
microprocessor . 

B.  BLOCK:  This  program  does  the  Sobel  operation  on 

a 3x3  block  of  data  using  the  CRC-111 
chip. 

C.  PACK2 : This  program  intercepts  a 128  x 128  x 4 

bit  picture  from  USC. 

D.  SOBEL:  This  program  does  a Sobel  operation  on  a 

128  x 128  bit  picture  using  the  CRC-111 
chip.  The  results  are  displayed  on  the 
128  x 128  x 4 bit  display. 

E.  CSOBL:  This  program  calculates  exactly  the 

Sobel  operator  on  a 128  x 128  x 4 bit 
picture.  The  results  are  displayed  on 
the  128  x 128  x 4 bit  display  and  can  be 
compared  with  the  results  of  program 
SOBEL. 

A block  diagram  of  the  teat  facilities  are  shown  in 
figure  16.  Also  a number  of  teat  imagea  have  been  generated 
to  provide  data  of  the  circuits  performance. 
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Conclusion 

Significant  progress  has  been  made  in  three  areas:  the 
fabrication  and  test'  7 on  Test  Circuit  I,  the  design  and 
fabrication  of  Test  Cj*.  it  II,  and  the  development  of  the 
computer  based  test  1'  cilities.  At  the  present  time.  Test 
Circuit  I has  been  tested  on  a number  of  geometric  test 

images  providing  proof  of  concept.  More  extensive  tests 

will  continue  on  this  circuit  in  the  next  period.  The 
detailed  design,  layout  and  fabrication  of  the  second 

circuit  is  now  complete  (as  shown  in  figure  15)  and 
preliminary  evaluation  has  begun.  The  test  facilities  have 
expanded  to  provide  the  appropriate  interface  between  the 
USC  PDP-10  and  our  CCD  circuits  and  provide  all  the 
necessary  clocks  and  waveforms  to  operate  both  test 

circuits.  During  the  next  period,  we  expect  to  complete  the 
testing  of  both  circuits  and  provide  facilities  to  operate 
them  at  real-time  (TV)  rates  as  well  as  investigate  new 
concepts  for  our  next  circuit  development. 


Recent  Ph.D.  Dissertations 


This  section  includes  those  dissertations  completed 
since  the  last  reporting  period.  The  ones  listed  here 
reflect  results  in  two  areas  of  our  image  understanding 
project.  The  first  report  describes  a joint 
detection-estimation  approach  to  boundary  estimation.  The 
methods  of  nonlinear  estimation  theory  are  applied  to  the 
detection  of  objects  in  extremely  poor  signal -to- noise  ratio 
images  with  surprisingly  good  results.  The  second  report 
describes  the  subject  of  image  segmentation  by  clustering. 
This  work  utilizes  the  methodologies  of  pattern  recognition, 
clustering,  principal  components  analysis,  feature  selection 
and  image  pre-processing  to  automatically  segment  images 
into  homogeneous  regions. 

Results  of  the  above  two  research  topics  ate  described 
in  detail  in  USCIPI  Report  760  and  USCIPI  Report  750, 
respectively. 


5.1  A Joint  Detection-Estimation  Approach  to  Boundary 
Estimation 


Simon  Lopez -Mora 

The  estimation  of  object  boundaries  based  on  noisy 
observations  is  considered  in  the  context  of  joint  detection 
and  estimation. 

The  images  are  expressed  as  replacement  processes  and 
the  boundaries  modelled  in  terms  of  geometrical  parameters 
associated  with  the  object.  The  images  studied  have  two 
textures,  object  and  background,  characterized  by  their 
first  and  second  order  statistics.  A boundary  processor 
consisting  of  optima  estimator  and  detector  is  derived,  for 
an  appropriately  chosen  cost  function.  Differences  between 
the  cost  function  and  resultant  processor  with  other  costs 
and  estimator-detector  pairs  used  previously  in  other 
applications  is  indicated.  The  optimal  solution  involves  a 
nonlinear  estimator  and  a detector  with  a variable  threshold 
dependent  on  the  estimator  output. 

Further,  because  of  information  restrictions  imposed  on 
the  estimator  that  alleviate  its  computational  requirements, 
a recursive,  easily  implementable  algorithm,  updating  only 
the  first  two  moments  is  derived,  and  subsequently  used  to 
evaluate  the  estimate  as  well  as  to  perform  detection. 

Experimental  results  are  illustrated.  Of  particular 
significance  is  the  applicability  of  said  processor  under 
very  low  signal  to  noise  ratio  conditions. 
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5.2  Image  Segmentation  by  Clustering 
Guy  B.  Coleman 


The  segmentation  of  imagery  into  homogeneous  regions 
using  digital  techniques  has  been  a goal  of  researchers  for 
the  past  several  years.  Pattern  recognition  approaches 
using  mathematical  models  have  achieved  results  which  are 
only  partially  satisfactory.  The  large  dimension  of  the 
pattern  space  and  the  quantity  of  data  involved  in  the 
digital  representation  of  images  are  in  part  responsible  for 
the  limited  applicability  of  these  approaches.  Other 
shortcomings  are  related  to  the  demands  for  data  with  which 
to  train  the  classifier. 

Approaches  based  on  linguistic  models  have  also  been 
tried,  again  with  results  which  are  partially  satisfactory. 
The  most  serious  shortcomings  are  related  to  the  performance 
of  these  approaches  in  the  presence  of  noise,  a phenomenon 
with  which  man  has  learned  to  function  effectively. 

This  dissertation  describes  a procedure  for  segmenting 
imagery  using  digital  techniques  and  is  based  on  the 
mathematical  model.  The  classifier  does  not  require 
training  prototypes,  that  is,  it  operates  in  an 
"unsupervised”  mode.  The  procedure  is  general  in  that  the 
features  most  useful  for  the  particular  image  to  be 
segmented  are  selected  by  the  algorithm.  The  algorithm 
operates  without  any  human  interaction. 

The  features  used  are  based  on  brightness  and  texture 
in  regions  centered  on  every  picture  element  in  the  image. 
To  perform  an  elementary  pre-classification  of  local 
regions,  a filter  based  on  the  mode  of  the  local  area 
histogram  is  proposed  and  used  in  segmenting  images. 
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The  basic  procedure  is  a K-means  clustering  algorithm 
which  converges  to  a local  minimum  in  the  average  squared 
inter-cluster  distance  for  a specified  number  of  clusters. 
The  algorithm  iterates  on  the  number  of  clusters,  evaluating 
the  clustering  based  on  a parameter  of  clustering  quality. 
The  parameter  proposed  is  a product  of  between  and  within 
cluster  scatter  measures,  which  achieves  a maximum  value 
that  is  postulated  to  represent  an  intrinsic  number  of 
clusters  in  the  data. 

It  has  been  impossible  in  the  past  to  compare  different 
segmentations  of  the  same  image.  A comparison  measure  based 
on  the  joint  histogram  of  the  two  segmentations  is  proposed 
and  examples  of  its  use  are  presented. 

It  is  within  the  state  of  the  art  to  adapt  the 
segmentation  procedure  described  herein  to  operate  in 
hardware  at  television  rates.  A functional  diagram  of  such 
a system  is  presented,  and  estimates  of  the  required 
capacities  are  given. 
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